This is a casual post to dump some target sites for scraping, or just project ideas. These messages were first sent through the COMM7780/JOUR7280 WeChat group. Although we have explored only part of these possibilities this semester, the list is good for future reference. We can bounce ideas off each other in the comments below and enrich this list.
Top Targets: Movie, Shopping and News
Let’s first look at what the students care about, based on the HW2 submissions:
Open Data Day (ODD) is an annual celebration of open data all over the world. In 2018, more than 400 cities organised hackathons simultaneously on Mar 3. According to one Hong Kong organiser, Bastien Douglas, most local organisers of ODD are government affiliates. In Hong Kong, communities like OSHK and ODHK lead the organisation every year. One highlight of ODD-HK-18 was the talk by Jessica Lo, the system manager at OGCIO responsible for the open data portal, data.gov.hk.
It is time to break down the broad concept of “data journalism”. When talking about the combination of data and news, we usually refer to two processes, sometimes conducted in an integrated manner. One process is to discover news points from datasets. The datasets can provide a lead for further investigation. The final product does not necessarily reflect the usage of data. It may look the same as normal news products, mainly composed of interviews and photos. This is called “data mining” in the science domain. The other process is to present news points using data. This is where all kinds of charts and interactive/immersive presentations come in. This is called “data visualisation” in the science domain.
Let’s focus on the “data mining” part in this article. That is to discover news from datasets, or more precisely, to discover a news lead from datasets. Developing the lead into an entire news story may take much more effort, with a combination of traditional and modern methods. For easier discussion, we treat “news” in the general form: something the audience does not know before reading, a.k.a. something that “appears new”. It could be a status update on a current affair, or it could be “new knowledge” to the readers (which may well be “common knowledge” to experts, a point we don’t want to waste time debating).
As advocated by the “Road to Jan”: the most profound theory takes the simplest form. As a first step, we try not to use programming, or even sophisticated spreadsheet skills. One can readily find some “news” with a bit of a “nose for news”; being computer literate is good enough. In this article, we will demo a few news points mined by our undergraduate students from the Hong Kong government data portal: https://data.gov.hk . It took around 20 minutes in the second class of a data journalism course. We start with a public dataset from the portal, check out the data tables and eyeball them for anything interesting. The process is so quick that we would like to give it a brand name: Lightning News. One can sharpen one’s news sense and data sense by doing this as a daily exercise.
If you still don’t know what “blockchain” or “bitcoin” is, the recent work by Max Galka will convince you that it is high time to do some self-study, or you will miss the birth of “another Internet”. The idea of an ICO, Initial Coin/Chain Offering, is analogous to an IPO. With the inception of “smart contract” capability, fundraising, a process of exchanging currency for certificates, can be done in a distributed manner. The “currency” in the chain world can be Ethereum, NEO, Bitcoin, … The “certificate” in the chain world is called a “token”, so the ICO process is also referred to as a “token sale”. The convenience of ICOs has driven rapid growth, with huge amounts of capital pouring into this field. Just check out this interactive/animated token sale history.
Data is the key to environmental investigation and monitoring. However, it is very hard for ordinary citizens to get access to. Take water quality, which is closely associated with our daily life, as an example. When serious environmental disasters break out and the government discloses only limited information, the general public can hardly learn the truth in time. This motivates us to organise this workshop, which enables you to make DIY monitoring devices with open technology.
We are pleased to announce a three-day data journalism bootcamp at the end of January. This is an intensive training to get you on board this fascinating battleship in the new media ocean. You will spend a fruitful weekend with 60 students from all Hong Kong higher education institutions. The event adopts a “startup weekend” format and features hands-on experience. Friday evening will see an overview of data journalism and team formation. Teams can work at any time from Friday evening through Sunday afternoon to finish a data journalism project. Saturday is composed of three structured workshops: data collection/preprocessing, descriptive statistics and data visualisation. Sunday morning will see some industry practitioners and community contributors sharing tips and pointers to further broaden the horizons of participants. Most training sessions are optional, and attendees can pick up their preferred skills as needed.
Date/Time: Jan 26 (Fri) evening to Jan 28 (Sun) afternoon
Day 1 (Fri): Cheng Yu Tung Building (100m from MTR University Station)
Day 2 (Sat)/ Day 3 (Sun): Learning Garden, G/F, University Library, CUHK
Audience: Students in Hong Kong higher education institutions
Photo: Will Su on KANTAR Information Is Beautiful 2017
We have invited Will SU Jiahao, the winner of the Information Is Beautiful Award 2017, to share his experience on data visualisation. As someone who entered the data visualisation industry with zero knowledge of either programming or statistics, he will talk about how he transitioned from being a traditional graphic designer to a data visualisation specialist within the span of one year. The process involves picking up programming skills and becoming comfortable with both front-end web development and back-end DevOps. He will also touch on the exciting process of visualising data, as well as some of the common questions and obstacles newcomers may face, how to overcome them, and how to progressively become acquainted with the workflow of a web-based data visualisation storytelling piece.
Date: Jan 12 (Fri), 2018
Time: 11:30am – 12:10pm
Venue: Room 1024, Communications and Visual Arts Building, HKBU