Data Science with MSc Program in Statistics
The MSc Program in Statistics, with a new and updated curriculum, is calling for applications for the 2020 academic year. Applications are invited for both the part-time evening program and the full-time day program.
The new and updated curriculum has been designed around the "Greater Data Science" (GDS) discipline [see Donoho (2017)]. GDS consists of the following six divisions:
- Data Gathering, Preparation and Exploration
- Data Representation and Transformation
- Computing with Data
- Data Visualization and Presentation
- Data Modeling: Machine Learning and Statistical Inferences
- Science about Data Science
Our program has been designed to cover all aspects of the GDS with particular emphasis on both Data Visualization and Data Modeling.
Visualization is an essential ingredient of data science. The most basic use of a visual output is to communicate an analysis result in an easily interpretable form for all kinds of users. A well-designed visual display can be more approachable than lists of numbers and text full of technical terms. To bring humans closer to the data analytics loop, visualization can also act as an interface between machine intelligence and domain expertise. Not only can visualization allow effortless data consumption, but it can also enable other types of data analytics. So-called "visual analytics" is suitable for exploring new datasets and for forming questions to be examined further using additional data science methods.
Data modeling aims to provide an understanding of the data at a deeper level. Machine learning and statistical inference are the main tools of this division. Machine learning and predictive modeling are based on constructing a model that detects patterns in a data set, which can then be used to provide accurate predictions for a given data universe. Deep learning is an example of a predictive modeling approach that has been successfully applied in areas such as computer vision and natural language processing. Statistical inference and generative modeling employ a stochastic model that generates data, and use the data to infer key properties of the underlying generative mechanism. Confidence and decision rules can then be derived from the model and the data. Probabilistic graphical models are an example of a modern generative modeling method and have been employed in a wide range of applications, from medical diagnosis to causal inference. The ultimate goal of data modeling is to construct an effective decision support process within a rich data environment.
From August 4 to 11, 2019, Chula-SI, together with Depa, VISTEC and the Max Planck Institute, co-organized MLRS2019 (Machine Learning Research School 2019), a workshop on machine learning. The event brought together leading machine learning researchers from various countries to share their experience with participants from Thai academia and industry.