PigGene - A platform for the graphical creation of Apache Pig workflows

Thesis Type Bachelor
Thesis Status Finished
Student Clemens Banas


Due to rapid progress in sequencing technology, data volumes in bioinformatics are increasing dramatically. The aim of this bachelor thesis is therefore the development of PigGene, a graphical platform for generating Apache Pig scripts that analyze data in a massively parallel way. An intuitive user interface guides users through the process of building scripts and lets them generate workflow definitions for complex data queries on the fly. Generated scripts are ready to use and can be executed directly on Cloudgene, a graphical MapReduce execution platform. PigGene can be extended with user scripts and is aimed especially at inexperienced users in the area of bioinformatics.
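
To make the idea of generating a script from a workflow definition more concrete, the following is a minimal sketch and not PigGene's actual implementation. The `Step` class, the `to_pig` function, the file paths, and the column names are all hypothetical and chosen only for illustration. Only the emitted `LOAD`, `FILTER`, and `STORE` statements follow standard Pig Latin syntax.

```python
from dataclasses import dataclass


@dataclass
class Step:
    """One node of a hypothetical workflow (not PigGene's real data model)."""
    op: str        # "load", "filter", or "store"
    alias: str     # relation name this step defines (unused for "store")
    source: str    # input path for "load", input relation otherwise
    arg: str = ""  # schema for "load", condition for "filter", output path for "store"


def to_pig(steps):
    """Translate workflow steps into Pig Latin statements, one per step."""
    lines = []
    for s in steps:
        if s.op == "load":
            lines.append(f"{s.alias} = LOAD '{s.source}' AS ({s.arg});")
        elif s.op == "filter":
            lines.append(f"{s.alias} = FILTER {s.source} BY {s.arg};")
        elif s.op == "store":
            lines.append(f"STORE {s.source} INTO '{s.arg}';")
        else:
            raise ValueError(f"unsupported operation: {s.op}")
    return "\n".join(lines)


# Assumed example workflow: load variants, keep high-quality ones, store them.
workflow = [
    Step("load", "variants", "input/variants.tsv",
         "chrom:chararray, pos:int, qual:double"),
    Step("filter", "high_qual", "variants", "qual >= 30.0"),
    Step("store", "", "high_qual", "output/high_qual"),
]
script = to_pig(workflow)
print(script)
```

In this sketch each graphical step maps to exactly one Pig Latin statement, so the user never has to write the syntax by hand. A real platform would additionally need to validate relation names and schemas before emitting the script.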