OPTIMIZING SEARCH FUNCTIONALITY: A PERFORMANCE COMPARISON BETWEEN SOLR AND ELASTICSEARCH

Authors

  • Sudeesh Goriparthi, Senior Software Engineer, Software Architecture, Walmart, Dallas, USA

Keywords:

Apache Solr, Elasticsearch, Low-powered

Abstract

Data has never been more important than in the current era of big data, because large datasets conceal previously unknown insights. However, extracting usable information from massive amounts of data is both necessary and difficult. Developers of data-intensive systems have consequently encountered several obstacles when carrying out data processing and analytics in a range of environments. Full-text search is one of the most crucial parts of big data analytics and processing, since it helps to unearth important data points concealed in enormous datasets. This paper first compares full-text search technologies, their features, and their technical aspects. It then compares Apache Solr and Elasticsearch on three separate datasets in terms of indexing speed and query performance; the importance of the topic motivated these measurements. Based on indexing times recorded on three separate workstations with different hardware specifications, we conclude that Apache Solr performs better with its default configuration. In query performance, Apache Solr outperforms Elasticsearch in every case tested. Based on these results, we recommend Apache Solr over Elasticsearch on low-powered computers. The results offer researchers and developers of data-intensive systems a thorough comparison and some guidance on choosing the full-text search engine best suited to their needs.

 

Published

2023-08-08

How to Cite

OPTIMIZING SEARCH FUNCTIONALITY: A PERFORMANCE COMPARISON BETWEEN SOLR AND ELASTICSEARCH. (2023). INTERNATIONAL JOURNAL OF DATA ANALYTICS (IJDA), 3(1), 67-78. https://lib-index.com/index.php/IJDA/article/view/IJDA_03_01_006