Data transformations on large volumes: cleaning and feature engineering to prepare data for statistical analysis and machine learning, in batch and real time
Unstructured data to structured data (e.g., XML and log transformations to columnar formats)
Spark / Hadoop YARN (DataFrames, RDDs, Streaming) and Hadoop HDFS
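
As a small illustration of the unstructured-to-structured item above, the sketch below turns raw log lines into named columns using only the Python standard library. The log layout (date, time, level, message), the `LOG_PATTERN` regex, and the `logs_to_columns` helper are assumptions made for this example, not a description of any specific pipeline; at scale the same idea would typically run inside a Spark job.

```python
import re

# Hypothetical log layout assumed for this sketch: "<date> <time> <LEVEL> <message>"
LOG_PATTERN = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) "
    r"(?P<time>\d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) "
    r"(?P<message>.*)$"
)


def logs_to_columns(lines):
    """Parse raw log lines into a dict of column name -> list of values.

    Lines that do not match the assumed layout are skipped.
    """
    columns = {"date": [], "time": [], "level": [], "message": []}
    for line in lines:
        match = LOG_PATTERN.match(line.strip())
        if match is None:
            continue  # not a recognizable log line
        for name in columns:
            columns[name].append(match.group(name))
    return columns


# Illustrative sample input (made up for the example)
sample = [
    "2021-03-04 12:00:01 INFO job started",
    "not a log line",
    "2021-03-04 12:00:05 ERROR disk full",
]
table = logs_to_columns(sample)
```

Each key of `table` holds one column, so the result can be handed to a DataFrame constructor or written out in a columnar file format.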