
Anthony Awuley Machine Learning and Software Engineer

12 Mar

Building Ozone Level Forecasting System Using Genetic Programming on a Distributed Computing Platform

This work implements a distributed evolutionary algorithm to model environmental data provided by the Texas Commission on Environmental Quality (TCEQ). The aim is to build a highly reliable ozone level forecasting system from 2,536 data records gathered over 7 years, each record having 72 measured environmental features.

Genetic Programming (GP) is used as the evolutionary technique, both to evolve computational prediction models and to perform feature selection on the dataset. Given the high dimensionality of the dataset, the GP system is deployed concurrently using island model parallelism on a set of five homogeneous processor nodes. The best evolved result should reliably identify the relevant features among the 72 attributes and predict future ozone status given an unknown distribution of environmental data.
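To make the island model idea concrete, here is a minimal sketch rather than the system described in this post. It makes several simplifying assumptions: individuals are plain bit masks over the 72 features instead of GP trees, the fitness function and the `RELEVANT` feature set are made up for illustration, the population size, generation count and migration interval are arbitrary, and the five islands run one after another in a single process instead of on separate nodes. Only the counts of 72 features and five islands come from the text.

```python
import random

N_FEATURES = 72          # measured features per record (from the text)
N_ISLANDS = 5            # one island per processor node (from the text)
POP_SIZE = 20            # hypothetical
GENERATIONS = 30         # hypothetical
MIGRATION_INTERVAL = 5   # hypothetical: migrate every 5 generations
RELEVANT = set(range(0, N_FEATURES, 4))  # hypothetical "useful" features


def fitness(mask):
    """Toy fitness: reward hypothetical relevant features, penalise extras."""
    hits = sum(1 for i in RELEVANT if mask[i])
    extras = sum(mask) - hits
    return hits - 0.5 * extras


def evolve_one_generation(pop, rng):
    """Elitism plus binary tournament selection and one bit-flip mutation."""
    new_pop = [max(pop, key=fitness)[:]]          # keep the island's best
    while len(new_pop) < len(pop):
        a, b = rng.sample(pop, 2)
        child = (a if fitness(a) >= fitness(b) else b)[:]
        i = rng.randrange(N_FEATURES)
        child[i] = 1 - child[i]                   # flip one feature bit
        new_pop.append(child)
    return new_pop


def migrate(islands):
    """Ring topology: the best of island k-1 replaces the worst of island k."""
    bests = [max(pop, key=fitness)[:] for pop in islands]
    for k, pop in enumerate(islands):
        worst = min(range(len(pop)), key=lambda j: fitness(pop[j]))
        pop[worst] = bests[(k - 1) % len(islands)]


def run_islands(seed=0):
    rng = random.Random(seed)
    islands = [
        [[rng.randint(0, 1) for _ in range(N_FEATURES)] for _ in range(POP_SIZE)]
        for _ in range(N_ISLANDS)
    ]
    for gen in range(1, GENERATIONS + 1):
        # Each island evolves independently; on a cluster this is the parallel part.
        islands = [evolve_one_generation(pop, rng) for pop in islands]
        if gen % MIGRATION_INTERVAL == 0:
            migrate(islands)
    return max((ind for pop in islands for ind in pop), key=fitness)


if __name__ == "__main__":
    best = run_islands()
    print("selected features:", sum(best), "fitness:", fitness(best))
```

A panmictic run corresponds to a single island with no migration step. Islands only exchange individuals occasionally, so each one can explore a different part of the search space between migrations.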

GP was deployed on the distributed system using the island model, while a panmictic model was used on the single processing element (PE). A t-test showed no significant difference in the observed results (prediction rules) between the distributed algorithm (island model) and the sequential algorithm (panmictic model) at the 0.05 significance level. However, the island model showed higher generality in its results. The distributed algorithm was not cost optimal, even though a significant speedup of 4.55 was achieved. The distributed algorithm also outperformed the sequential algorithm by using a minimum of 21 of the 72 measured features in the dataset to evolve a highly predictive rule.
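The relationship between the reported speedup and cost can be checked with a short calculation. This sketch assumes the speedup of 4.55 was measured with all five nodes in use, which the text implies but does not state outright. The function names are illustrative. Efficiency is speedup divided by processor count, and the ratio of parallel cost (processors times parallel run time) to sequential run time is the processor count divided by the speedup.

```python
def efficiency(speedup, n_processors):
    """Fraction of ideal linear speedup actually achieved."""
    return speedup / n_processors


def cost_ratio(speedup, n_processors):
    """Parallel cost (p * T_p) relative to sequential time T_s, i.e. p / S."""
    return n_processors / speedup


SPEEDUP = 4.55   # reported in the text
N_NODES = 5      # assumption: all five nodes were used for this measurement

print(f"efficiency = {efficiency(SPEEDUP, N_NODES):.2f}")   # efficiency = 0.91
print(f"cost ratio = {cost_ratio(SPEEDUP, N_NODES):.2f}")   # cost ratio = 1.10
```

Under that assumption, the five nodes together consumed roughly 10% more processor time than the sequential run. That is one way to read the remark that the distributed algorithm was not cost optimal despite the near-linear speedup.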
