DECENTRALIZED SYSTEMS FOR BIG DATA MANAGEMENT AND DECISION MAKING

Σπύρος Σιούτας

Description

The course aims to introduce students to two pillars: (1) Foundations of Advanced Decentralized Computing Systems, and (2) a Practical Overview of non-traditional software systems for big data management (with emphasis on Spark, Python, and PySpark).

In particular, it focuses on the following topics:

  1. Hashing, Bloom Filters, Internet Caching Protocols, Distributed Hash Tables.
  2. Decentralized Data Structures and P2P Systems, DHT-based Decentralized Systems (Chord).
  3. Blockchain and Decentralized Applications (DApps): Hashing Data in the Real World, Storing Transaction Data, Using the Data Store, Protecting the Data Store, Distributing the Data Store Among Peers, Verifying and Adding Transactions, Choosing a Transaction History.
  4. Distributed File Systems (HDFS), the MapReduce Programming Framework and NoSQL Databases, Cluster Architecture, Data Flow Systems, Spark, RDDs.
  5. Overview of Python for big data management: Introduction to libraries and tools (pandas, NumPy, etc.), Introdu