Earthquakes in Southeast Asia in 50 years

Summary: We used an API (Application Programming Interface) to extract data from the USGS database, analyzed the last 50 years, and estimated the frequency of earthquakes in Southeast Asia. With the help of Python, the extracted data was exported to a CSV file and categorized by parameters such as country, magnitude, and year.


An application programming interface (API) is commonly used to extract data from a remote web server. In layman's terms, an API lets one program retrieve data or information from another. Several websites, such as Facebook, the USGS, Twitter, and Reddit, offer web-based APIs that make it easy to obtain their data.

To retrieve data, we send requests to the host web server that holds the data, adjusting parameters such as the URL to connect to the server. Different websites use different request formats, which can easily be looked up on the host's website.

In this module, we will extract data on earthquakes that hit Southeast Asia in the last 50 years from the USGS web server via its API.
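As a sketch of what such a request looks like, the snippet below builds a query URL for the public USGS FDSN event service. The parameter values, including the rough bounding box for Southeast Asia, are illustrative assumptions, not the exact query we used:

```python
from urllib.parse import urlencode

# Base endpoint of the public USGS earthquake catalog API.
BASE_URL = "https://earthquake.usgs.gov/fdsnws/event/1/query"

# Illustrative query parameters: GeoJSON output, a 50-year window,
# and an assumed bounding box around Southeast Asia.
params = {
    "format": "geojson",
    "starttime": "1968-01-01",
    "endtime": "2018-01-01",
    "minlatitude": -11,
    "maxlatitude": 29,
    "minlongitude": 92,
    "maxlongitude": 142,
    "minmagnitude": 4.0,
}

request_url = BASE_URL + "?" + urlencode(params)
print(request_url)
# The URL can then be fetched with urllib.request.urlopen(request_url)
# or with the third-party requests library.
```

Tweaking the parameters in the dictionary is all it takes to point the same request at a different region or time window.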

Earthquakes are among the most frequent natural disasters on planet Earth. A sudden release of energy in the Earth's lithosphere generates seismic waves, which lead to sudden shaking of the Earth's surface. This natural disaster has led to the deaths of millions of people all around the world.

The strength of an earthquake is measured on the Richter magnitude scale, or simply "magnitude", a logarithmic scale that typically runs from 1 to 10.

Southeast Asia is among the regions of the world most prone to earthquakes. To find the trend in the region, we extracted 50 years of data from the USGS via its API and converted it into a CSV file with Python for a comprehensive understanding of the earthquake situation in Southeast Asia.
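A minimal sketch of the CSV step, assuming the GeoJSON response has already been parsed into a list of dictionaries. The sample records, field names, and the country heuristic are made up for illustration; USGS timestamps are milliseconds since the Unix epoch:

```python
import csv
from datetime import datetime, timezone

# Hypothetical sample of parsed GeoJSON features: "place" is a free-text
# location string and "time" is milliseconds since the Unix epoch.
quakes = [
    {"place": "120km SE of Davao, Philippines", "mag": 6.1, "time": 502761600000},
    {"place": "Near the coast of Sumatra, Indonesia", "mag": 7.4, "time": 1104710400000},
]

def country_of(place):
    """Crude categorization: take the text after the last comma."""
    return place.rsplit(",", 1)[-1].strip()

# Write one row per earthquake, categorized by country, magnitude, and year.
with open("quakes.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["country", "magnitude", "year"])
    for q in quakes:
        year = datetime.fromtimestamp(q["time"] / 1000, tz=timezone.utc).year
        writer.writerow([country_of(q["place"]), q["mag"], year])
```

Once the rows are in a CSV file, the by-country, by-magnitude, and by-year breakdowns are straightforward spreadsheet or pandas operations.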

Continue reading “Earthquakes in Southeast Asia in 50 years”

Using Big Data to Figure Out How Fair China Daily News is

Summary: Unfair and unbalanced news stories mislead readers, hiding and even distorting the truth; they erode the credibility of the media and create more "news victims". A credible news organization must keep its reporting as close to the facts as possible. Here we take China Daily as an example and analyze whether its news is fair.

We decided to rely on data to quantify fairness, using Python as an effective way to measure how fair the news is.

Background: Difficulty to Reach Absolute Objectivity

According to the Cambridge online dictionary, objectivity means “not influenced by personal opinion or feeling.” For a long time in journalism, objectivity meant writing a story without putting any personal opinion into it.

Over the last several years, many journalists stopped using "objectivity" in favor of the word "fairness." Complete objectivity, they reasoned, is impossible; fairness is attainable. Fairness means that you tell a story in ways that are fair to all sides once all the available information is considered.

Telling a story fairly is more difficult than it sounds. Reporters try to put colorful images and descriptions into their stories. For fresh reporters, especially those working in a second language, it can sometimes be difficult to distinguish between colorful description and editorializing. Some words have a feeling or connotation to them that is hard to recognize. Some English words have "loaded" or "double" meanings that are extremely positive or negative. Writers should be aware of the positive or negative meanings of a word and how its use affects an article. Also, as human beings, we all have feelings and opinions about the events and issues around us, and it is sometimes difficult to conceal those feelings, especially if we feel strongly about something. These feelings sometimes come through in our stories in the words we choose.

Therefore, TextBlob, a Python library, can be used to point out this human subjectivity in news text.
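TextBlob's actual call is `TextBlob(text).sentiment.subjectivity`, which returns a float in [0, 1]. Since TextBlob may not be installed everywhere, the sketch below is a toy stand-in that mimics the idea with a hand-picked cue-word list; the word list and scoring rule are invented for illustration only:

```python
# Toy stand-in for TextBlob's subjectivity score. The real call is:
#     from textblob import TextBlob
#     TextBlob(text).sentiment.subjectivity   # float in [0, 1]
# The cue-word list below is invented for illustration only.
SUBJECTIVE_CUES = {
    "amazing", "terrible", "beautiful", "awful", "great",
    "horrible", "wonderful", "disgraceful", "brilliant", "shameful",
}

def toy_subjectivity(text):
    """Fraction of words that are subjective cue words (0.0 to 1.0)."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    if not words:
        return 0.0
    hits = sum(1 for w in words if w in SUBJECTIVE_CUES)
    return hits / len(words)

factual = "The council approved the budget on Tuesday."
loaded = "The council approved the shameful budget in a disgraceful vote."
print(toy_subjectivity(factual), toy_subjectivity(loaded))  # 0.0 0.2
```

Running a scorer like this over a corpus of articles and comparing average scores across outlets is the basic shape of the fairness analysis described above.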

Continue reading “Using Big Data to Figure Out How Fair China Daily News is”

Data News of the Week | Gender Pay Gap: Why and How?

Professor Jordan Peterson has been the center of attention in the last few weeks for participating in a number of debates on the gender wage gap. Unlike feminists calling for a reduction in pay discrimination, he believes the gender wage gap is an explainable consequence of multiple social factors rather than a problem caused by discrimination. Is he right? Why is there a gender wage gap at all? Looking into three reports (Why is There a Gender Wage Gap – Our World in Data; Six Key Facts About the Gender Pay Gap – Our World in Data; Gender Pay Gap: the Day Women Start Working For Free – Washington Post) and a recently published analysis of gender-wage-gap statistics will give us a thorough understanding of the current gap. Continue reading “Data News of the Week | Gender Pay Gap: Why and How?”


[Repost editor's note] Since taking office, Donald Trump has been a focus of attention for media and scholars alike. This "governs-by-tweet" president is not only endlessly newsworthy but also comes with rich datasets, making him an ideal subject for data-driven political reporting. This article comes from two HKU students; the first draft took shape at the Open Data Day Hong Kong 2017 hackathon and was edited and published by Initium Lab. There are two occasions for reposting it. First, Open Data Day Hong Kong 2018 will be held at HKU on 3 March, bringing together open-data activists, civic-tech enthusiasts, journalists, scholars, and citizens from across Hong Kong to launch projects and build prototypes within a single day. Some teams continue development after the event and turn their prototypes into excellent data applications or data stories. This article is a classic case, highly representative in topic selection, data collection/analysis/visualization, and project execution. A hackathon lets participants from different backgrounds brainstorm and collaborate under pressure, efficiently finding interesting topics and producing prototypes; turning a prototype into a finished work often takes several times as long as the hackathon itself and requires professional skills. We hope this article shows communication students working hard on Python, R, and JavaScript one possibility: the one who walks alone goes fastest, but those who walk together go farthest. The second occasion is that NBC recently released a dataset and reporting on Russian fake accounts on Twitter. Trump's rise felt like a slap in the face to much of the elite; panicked, they have kept blaming the media and social networks. Did Russia really meddle, and how much did it matter? Why did the polls fail to detect so many Trump supporters: random error or systematic error? Such questions will keep resurfacing for a long time, and people are eager for every clue. One could say that watching Trump and watching Twitter will never run out of data or stories. This article is a typical example of text analysis and visualization, done in R, with much worth borrowing.


Newly inaugurated US President Donald Trump stands out among politicians for his extreme rhetoric. In 2016, Initium Lab analyzed the differing speaking styles of Trump and his rival Hillary Clinton in media interviews, finding that Trump favors simple sentence structures and skillfully uses second-person narration to win his audience's sympathy.

Continue reading “轉載:特朗普父女推特解密(ODD-HK-17作品)”

Data News of the Week | What can we, the 20-year-old, do to change the world?

Nathan Ruser, a 20-year-old Australian National University student majoring in international security with a keen interest in cartography, discovered that a fitness app had revealed the locations of secret military sites in Syria and elsewhere. He posted about it on Twitter, not expecting much response.

But the news ricocheted across the internet. Security experts said the Strava app’s “heat map” could be used by hostile entities to glean valuable intelligence. The Pentagon said it was reviewing the situation.

How did he find the news?

“Whoever thought that operational security could be wrecked by a Fitbit?” Mr. Ruser said in an interview with The New York Times from Thailand, where he is spending part of the Australian summer break.

When he looked over Syria on Strava’s map — which is based on location data from millions of users, including military personnel, who share their exercise activity — the area “lit up with those U.S. bases,” he said.

Before publicly sharing his findings over the weekend, he discussed them in a private chat group on Twitter, made up of people interested in intelligence and security issues. “I know about two-thirds of what I know about the world from the group chats,” he said.

Continue reading “Data News of the Week | What can we, the 20-year-old, do to change the world?”

Lightning News from Public Data Sets

It is time to break down the broad concept of “data journalism”. When we talk about combining data and news, we usually refer to two processes, sometimes conducted together. One process is to discover news points from datasets: the datasets provide a lead for further investigation, and the final product does not necessarily show the use of data; it may look the same as a normal news story composed mainly of interviews and photos. In the science domain this is called “data mining”. The other process is to present news points using data, which yields all kinds of charts and interactive/immersive presentations. In the science domain this is called “data visualisation”.

Let’s focus on the “data mining” part in this article: discovering news, or more precisely a news lead, from datasets. Developing the entire news story may take much more effort, combining traditional and modern methods. For ease of discussion, we treat “news” in its general form: something the audience does not know before reading, i.e., something that “appears new”. It could be a status update on a current affair, or it could be “new knowledge” to the readers (probably “common knowledge” to experts, which we don’t want to waste time debating).

As advocated by the “Road to Jan”: the most profound theory takes the simplest form. As a first step, we avoid programming, and even sophisticated spreadsheet skills. With a bit of “nose for news” and basic computer literacy, one can readily find some “news”. In this article, we demo a few news points mined by our undergraduate students from the Hong Kong government data portal. It took around 20 minutes in the second class of a data journalism course. We start with a public dataset from the portal, check the data tables, and eyeball whether there is anything interesting. The process is so quick that we would like to give it a brand name: Lightning News. Doing this as a daily exercise sharpens one’s news sense and data sense.

Continue reading “Lightning News from Public Data Sets”

Call for Helpers: Organising an Intra-University Competition

The School of Communication and the Department of Journalism of HKBU will organize a university-wide “HKBU Data-driven Storytelling Competition 2018” in early-to-mid March. We hope to facilitate interdisciplinary teaching, learning, and discussion on data-driven news reporting and storytelling, as well as to recognize exemplary student projects using emerging technologies.

Now the Competition Coordination Committee (the “Committee”) is calling for two student helpers to serve as event organizers, handling event preparation and support. Duties include the following:

  1. Designing a (simple one-page) poster for the event;
  2. Designing a website with several (static) pages for event updates;
  3. Monitoring and documenting the registration process;
  4. Handling participants’ submissions;
  5. Providing contact and liaison support for different parties;
  6. Documenting the entire process, such as taking photos and writing brief essays and recaps;
  7. Other administrative and logistic support assigned by the Committee.

Requirements: full-time HKBU students with good English and Chinese skills; prior knowledge of and experience in data-driven storytelling and web design will be an advantage.

Each student helper will receive an allowance at a rate of 50 HKD/hour, for a total of 10–15 hours.

Please submit your CV including your relevant credentials to Xinzhi Zhang ( xzzhang2 [AT] hkbu [DOT] edu [DOT] hk ) on or before 6:00 pm, 25 Feb (Sunday).

Note: Late applications will not be accepted. Also, for the sake of fairness, student helpers are NOT allowed to participate in the competition. The Committee appreciates all applications; however, only shortlisted candidates will be contacted.

A Lunar New Year Gift Pack of Learning Resources! Data Journalism Learning Tools

At the start of the Lunar New Year, we would like to recommend some useful online courses, materials, and tools for learning data journalism to all our studious friends. May the new year bring you progress in your studies and smooth sailing! (Parts of the original are reposted from , with some edits; click to see more: the learning-resources list)

  • A course series: from finding data and learning to interpret it, to data visualization and telling stories with data.

1. Doing Journalism with Data: First Steps, Skills and Tools (offered on the LEARNO.NET platform; five lessons; register directly and learn for free)

2. Data Exploration and Storytelling (taught by data experts Alberto Cairo & Heather Krause)

Continue reading “農曆新年學習資料大禮包!Data Journalism Learning Tools”


Reposted from the Global Investigative Journalism Network (GIJN, 全球深度報導網)

Original Article (this piece was converted from Simplified to Traditional Chinese)

As media organizations go digital, they increasingly use data visualization as a form of presentation. But many visualization works pursue only formal beauty and fail to deliver visualization's real function: communicating information clearly and effectively so that readers can grasp the meaning behind the numbers. The visualization news that media outlets publish is uneven in quality, and few knowledgeable peers or readers offer feedback. This week's data news introduces readers to a standardized evaluation rubric, proposed by Stephen Few, information-visualization expert and founder of the Perceptual Edge website, for judging whether a visualization achieves the necessary effect.


Continue reading “轉載:如何標準化數據可視化之「美」?”

Recap of Covering Open Space 101 Workshop

Civic Exchange (思匯政策研究所, “CE” hereafter), a think tank, held a Covering Open Space 101 workshop at HKBU on 7 Feb 2018, as part of its collaboration with HKBU JOUR on a series of investigative reports on open space in Hong Kong. CE researcher Carine Lai and experienced journalist Christopher DeWolf were invited to share their experience and techniques in doing open-space research in Hong Kong. After the learning sessions, students used the maps to practice the knowledge learned from the workshop. Here is a recap of the event.

Continue reading “Recap of Covering Open Space 101 Workshop”