Sunday, December 7, 2014

Terminology related to Hadoop

  1. Eclipse is a popular IDE donated by IBM to the open source community.
  2. Lucene is a text search engine library written in Java.
  3. HBase is the Hadoop database, a distributed, column-oriented store that runs on top of HDFS.
  4. Hive provides data warehousing tools to extract, transform, and load data, and then query that data where it is stored in Hadoop files.
  5. Pig is a high-level language that generates MapReduce code to analyze large data sets.
  6. Jaql is a query language for JavaScript Object Notation (JSON).
  7. ZooKeeper is a centralized configuration service and naming registry for large distributed systems.
  8. Avro is a data serialization system.
  9. UIMA (Unstructured Information Management Architecture) is an architecture for the development, discovery, composition, and deployment of analytics for unstructured data.
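
Several of the terms above (Pig, Hive) come down to producing MapReduce jobs, so it may help to see what such a job looks like. The following is a minimal sketch in plain Python, not Hadoop code: it imitates the map, shuffle, and reduce steps of a word count in a single process. The function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for this illustration and are not part of any Hadoop, Pig, or Hive API.

```python
from collections import defaultdict


def map_phase(lines):
    """Emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Group the emitted values by key, as a framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in grouped.items()}


def word_count(lines):
    """Run the three phases in order (hypothetical helper for this sketch)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["Pig generates MapReduce code", "Hive queries data", "pig and hive"]
    print(word_count(sample))
```

On a real cluster the map and reduce steps run in parallel on many machines and the framework handles the grouping in between; a high-level language like Pig exists so that you do not have to write these steps by hand.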
