HadoopLens consists of a family of Big Data management toolkits that are essential for any Hadoop deployment.
They enable parallel data migration, ingestion, encryption and compression over large datasets in Hadoop.
The toolkit includes a Hadoop Cluster Migrator that migrates data across firewalled, secured clusters where DistCp doesn't work.
The Hadoop Compressor backs up data from one cluster to another secured cluster with customizable compression options.
The Hadoop DC Replicator provides real-time data replication across data centres and is a very handy utility for Hadoop recovery solutions.
The Hadoop Folder Comparator provides a difference report comparing two folders across different clusters and is a handy tool for synchronization and backup.
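
As an illustration of what such a difference report might contain, here is a minimal sketch in Python. It is not HadoopLens code: the names `FileInfo`, `DiffReport` and `compare_listings` are hypothetical, and it assumes each folder has already been listed into a mapping from relative path to size and checksum.

```python
from typing import Dict, List, NamedTuple


class FileInfo(NamedTuple):
    """Per-file metadata (hypothetical; assumed to come from a folder listing)."""
    size: int
    checksum: str


class DiffReport(NamedTuple):
    only_in_source: List[str]
    only_in_target: List[str]
    changed: List[str]


def compare_listings(source: Dict[str, FileInfo],
                     target: Dict[str, FileInfo]) -> DiffReport:
    """Compare two folder listings keyed by relative path."""
    # Paths present on one side only.
    only_in_source = sorted(source.keys() - target.keys())
    only_in_target = sorted(target.keys() - source.keys())
    # Paths present on both sides whose size or checksum differs.
    changed = sorted(
        path for path in source.keys() & target.keys()
        if source[path] != target[path]
    )
    return DiffReport(only_in_source, only_in_target, changed)
```

A synchronization or backup job could then copy the paths in `only_in_source` and `changed`, and decide separately what to do with the paths in `only_in_target`.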
The SFTP Hadoop Uploader uploads data in parallel to a Hadoop cluster without any intermediate landing, and thus transfers data at high speed.
The utility can convert file formats on the fly and can generate Parquet, Avro or other file formats for Hadoop.
This is also an excellent utility for addressing the small file problem.
The Hadoop Small File Merge utility ingests small files into a Hadoop cluster in parallel and converts the ingested files into Parquet or Avro with a choice of compression types.
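
The small file problem arises because every file in HDFS costs NameNode memory regardless of its size, so merging many small files into fewer large ones reduces that overhead. The sketch below shows one simple way a merge could be planned. It is an assumption for illustration, not the HadoopLens implementation: `plan_merge_batches` is a hypothetical name, and the target size would typically be chosen near the HDFS block size.

```python
from typing import List, Tuple


def plan_merge_batches(files: List[Tuple[str, int]],
                       target_bytes: int) -> List[List[str]]:
    """Group (name, size) pairs into batches of at most target_bytes each.

    A file larger than target_bytes is placed in a batch of its own.
    """
    if target_bytes <= 0:
        raise ValueError("target_bytes must be positive")

    batches: List[List[str]] = []
    current: List[str] = []
    current_size = 0

    # Process in name order so the plan is deterministic.
    for name, size in sorted(files):
        # Close the current batch if adding this file would exceed the target.
        if current and current_size + size > target_bytes:
            batches.append(current)
            current, current_size = [], 0
        current.append(name)
        current_size += size

    if current:
        batches.append(current)
    return batches
```

Each resulting batch could then be written out as a single Parquet or Avro file, and because the batches are independent they can be processed in parallel.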