Yogyakarta, September 2nd, 2022
The Master of Science and Doctoral Program (MD) FEB UGM successfully held a Thesis Coaching event on September 2nd, 2022. Thesis Coaching is required for all Master of Science Program students who are working on a thesis. The event was held offline at the BRI Auditorium, 3rd Floor, MD FEB UGM Program, and featured Mr. Edi Winarko, M.Sc., Ph.D., Lecturer at the Faculty of Mathematics and Natural Sciences UGM, as the speaker. The training addressed a current topic, “Web Scraping,” and gave students guidance on processing data sourced from the web. It was attended by Master of Science students in Accounting, Economics, and Management.
Mr. Edi Winarko started his presentation by explaining the importance of web scraping techniques. Web scraping is a way of downloading data from web pages that contain various types of data, such as articles and job vacancies. The data is displayed as a web page (HTML) and is intended for human consumption, so it needs to be extracted from the page before it can be processed by a computer program. There are two approaches to web scraping: using programming libraries (e.g., in Python) and using point-and-click (no-code) tools such as Octoparse, Apify, and ParseHub.
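As an illustration of the library-based approach, the following is a minimal sketch in Python that uses only the standard library. It is not taken from the presentation: the sample HTML, the `job` class name, and the helper names `JobTitleParser` and `extract_job_titles` are assumptions made for this example. It shows how data written for human readers in HTML can be extracted into a form a program can process.

```python
from html.parser import HTMLParser

# Hypothetical sample page. A real scraper would first download
# this HTML from a website instead of using a hard-coded string.
SAMPLE_HTML = """
<html><body>
  <h1>Job Vacancies</h1>
  <ul>
    <li class="job">Data Analyst</li>
    <li class="job">Research Assistant</li>
    <li class="note">Updated weekly</li>
  </ul>
</body></html>
"""


class JobTitleParser(HTMLParser):
    """Collects the text of every <li class="job"> element."""

    def __init__(self):
        super().__init__()
        self.in_job = False  # True while inside a matching <li>
        self.titles = []

    def handle_starttag(self, tag, attrs):
        # attrs is a list of (name, value) pairs for the tag
        if tag == "li" and ("class", "job") in attrs:
            self.in_job = True

    def handle_endtag(self, tag):
        if tag == "li":
            self.in_job = False

    def handle_data(self, data):
        # Keep only non-empty text found inside a matching <li>
        if self.in_job and data.strip():
            self.titles.append(data.strip())


def extract_job_titles(html):
    """Return the list of job titles found in the given HTML string."""
    parser = JobTitleParser()
    parser.feed(html)
    return parser.titles


print(extract_job_titles(SAMPLE_HTML))
# ['Data Analyst', 'Research Assistant']
```

Point-and-click tools such as Octoparse perform a comparable selection step through a graphical interface, so no code needs to be written.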
Web scraping is used for industrial and organizational purposes (labor recruitment, marketing, e-commerce, and retail) as well as by individuals (data scientists, data journalists, researchers, and freelancers). Based on a survey, web scraping is often used by consumers for content scraping, research, contact scraping, price comparison, and weather data monitoring.
He then introduced the main parts of the Octoparse software: the Home screen and the Sidebar. At the top of the Home screen there is a Search bar for entering the web page to be scraped. The Sidebar menu contains the New, Dashboard, and Settings buttons. The workspace is used to build a scraper (task) and is divided into five sections: the browser, tips, workflow, settings, and data previewer.
The next session was hands-on training in web scraping with the Octoparse software. The results of web scraping can be exported directly for processing or stored first as a task in the Workspace. The results can be viewed in detail, so users can choose which types of information to use. Overall, the training session was interesting and easy to follow.
Through this training, it is hoped that students will become better at making use of information sources from the internet for academic purposes. Participants learned many new things from this Web Scraping training. The event ended at 11:00 a.m. and closed with a group photo. (Y)