SERVING MASSIVE CATALOG DATA AT SCALE WITH VERY HIGH AVAILABILITY AND LOW LATENCY

Authors

  • Suraj Modi, Uber Technologies Inc., USA
  • Preetham Vemasani, Uber Technologies Inc., USA

Keywords:

E-Commerce Catalog Data, Data Chunking And Retrieval, Serialization With Flatbuffers, Caching Strategies, Exponential Backoff For Availability

Abstract

This article explores the challenges and techniques involved in efficiently serving large amounts of catalog data in e-commerce platforms. It discusses the limitations of traditional databases, such as row size constraints, and presents approaches to overcome them: data chunking and retrieval techniques, efficient serialization using FlatBuffers, caching strategies, and exponential backoff for enhanced availability. It draws upon research studies, industry examples, and performance comparisons to highlight the effectiveness of these techniques in improving response times, reducing data inconsistencies, and ensuring high availability in e-commerce systems.
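
As a rough illustration of two of the techniques named in the abstract, the sketch below splits a serialized record into chunks that fit under a row size limit and retries a failing call with exponential backoff. It is a minimal sketch and not the authors' implementation: the names `split_into_chunks`, `reassemble`, and `call_with_backoff`, the limit `MAX_CHUNK_BYTES`, the delay parameters, and the use of random jitter are all assumptions made for illustration.

```python
import random
import time

# Hypothetical per-row size limit; real limits depend on the database in use.
MAX_CHUNK_BYTES = 4096


def split_into_chunks(payload, max_bytes=MAX_CHUNK_BYTES):
    """Split a serialized record into (index, chunk) rows of at most max_bytes each."""
    return [
        (offset // max_bytes, payload[offset:offset + max_bytes])
        for offset in range(0, len(payload), max_bytes)
    ]


def reassemble(rows):
    """Rebuild the original payload from chunk rows, in any order."""
    return b"".join(chunk for _, chunk in sorted(rows))


def call_with_backoff(fn, max_attempts=5, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep):
    """Call fn, retrying on ConnectionError with exponentially growing delays.

    The delay cap doubles on each attempt (bounded by max_delay), and the
    actual wait is drawn uniformly from [0, cap] so that clients do not
    retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return fn()
        except ConnectionError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            cap = min(max_delay, base_delay * 2 ** attempt)
            sleep(random.uniform(0, cap))
```

A reader would call `reassemble` on the rows fetched for one catalog item, and wrap the fetch itself in `call_with_backoff` so that transient storage errors are retried instead of failing the request immediately.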

Published

2024-06-04

How to Cite

Suraj Modi, & Preetham Vemasani. (2024). SERVING MASSIVE CATALOG DATA AT SCALE WITH VERY HIGH AVAILABILITY AND LOW LATENCY. INTERNATIONAL JOURNAL OF ADVANCED RESEARCH IN ENGINEERING AND TECHNOLOGY (IJARET), 15(3), 279-287. https://lib-index.com/index.php/IJARET/article/view/IJARET_15_03_024