The Spreadsheet Guy in “Spotlight”

MattCaroll

Hong Kong Baptist University journalism students interviewing Pulitzer Prize winner Matt Carroll in Hong Kong. Read The Young Reporter's short story here: https://m.facebook.com/tyrmag/photos/a.576662715691596/2139831466041372/?type=3&theater=

 

 

Data Journalism Open Lecture

Time: Nov. 2, 2018 (Fri.), 11:00 a.m.-12:20 p.m.

Venue: CVA 506, HKBU

Speaker: Andy Shu

Language: English

DJOP_021118 Final.jpg

Continue reading “Data Journalism Open Lecture”

Data Journalism Online Open Lecture

Time: Oct. 12, 2018 (Fri.), 9:45 a.m.-10:45 a.m.

Venue: CVA 506, HKBU

Speaker: Aaron Mendelson

Language: English

Poster.png

Continue reading “Data Journalism Online Open Lecture”

Open call for portal “general site” operation student helpers

binary-1979478_960_720.jpg

This post is an open call for 5 student helpers to join the portal operation team. Interested students, please send your CV to hupili@hkbu.edu.hk. Please use the title “[Application] Portal General Site Operation Student Helper” and include the following information in the email for quick reference:

Name; sID; year of study; concentration (a.k.a. “J”, if available); prior experience with WordPress (if available); expected length of commitment.

Continue reading “Open call for portal “general site” operation student helpers”

Symposium|Fall Symposium on Digital Scholarship 2018

 

Time: October 25, 2018 (Thursday)

Venue: AAB 201, Academic & Administration Building, Hong Kong Baptist University

Language: English

Online Registration: here

Welcome to join us!

Agenda

poster.jpg

Guest Speakers

DR. DONALD WATERS

Senior Program Officer, Andrew W. Mellon Foundation

DR. MIGUEL ESCOBAR VARELA

Assistant Professor, English Language and Literature, National University of Singapore

MR. ANDY CHO

Consultant, Radica Systems Limited

Internal Speakers

PROF. CLARA HO

Head, Department of History, Hong Kong Baptist University

DR. ANGEL LAI

Research Assistant Professor, Department of Social Work, Hong Kong Baptist University

PROF. DAVID CHUNG

Professor, Department of Music, Hong Kong Baptist University

MR. PILI HU

Lecturer, Department of Journalism, Hong Kong Baptist University

Data & News Salon #2 | From open source and open data to data journalism and civic engagement — Practice from MirrorMedia and READr

From open source and open data to data journalism and civic engagement — Practice from MirrorMedia and READr

 

Time: 14/6/2018 (Thur), 1:30 p.m.-2:30 p.m.

Venue: CVA105, HKBU

Speakers: 簡信昌 & 李又如

Language: Mandarin

Register Here

Welcome to join us!

1528685191(1)

Continue reading “Data & News Salon #2 | From open source and open data to data journalism and civic engagement — Practice from MirrorMedia and READr”

[Tools] Online Verification Tools – Notes from Google @ GEN 2018

At this year’s GEN (Global Editors Network) summit, the Google News Initiative shared its notes on data journalism resources, particularly for investigative research and verification through online forensics. Here are some excerpts; we obtained a copy from Google News Lab, and the complete notes are attached at the end.

Research Tools

  • Google Public Data Explorer provides public data and forecasts from a range of international organizations and academic institutions. The data are already visualized and ready to interpret.
  • Google Trends compares search terms in a country and timeframe of your choice.

Verification Tools

Find more on data journalism training and multimedia toolsets below:

Q2 2018_ Workshop Notes 2018

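Analysis of voting records in Hong Kong's Sixth Legislative Council (2016-2018)

The current Legislative Council term has been turbulent: six members have been disqualified (commonly known as "DQ") one after another. With the March 2018 by-election under way, we carried out this study, using the term's voting records to reconstruct a "political portrait" of each member for the public. From the data-driven spectrum and the voting heat map, we can see whether members act as they speak or merely put on a show. While the public hotly debates the "record-low turnout" and the failed "Tian Ji's horse racing" strategy, we should also remember that although who gets elected matters, what members do after being elected may deserve even more attention. The motions and voting records of each meeting reflect members' behaviour in detail.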

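Spectrum of members' voting tendencies

Since the Fifth Legislative Council, electronic voting records have been published on the website as structured XML data for public download. We used a crawler to collect the electronic votes cast by members of the Sixth Legislative Council from November 10, 2016 to March 29, 2018, a total of 27,426 vote records. For each motion, a member may produce one of five outcomes: Yes, No, Abstain, Absent, or Present (that is, attending the meeting without casting a "Yes", "No" or "Abstain" vote; see the Legislative Council glossary, 立法會小百科). After converting the outcomes to numerical values, we applied Principal Component Analysis (PCA) to find the greatest divergence reflected in these more than 20,000 vote records, that is, the principal axis, denoted PA1. We also computed the projection of each member's voting record onto PA1, that is, the principal component, denoted PC1. PC1 captures the relative relationships among members: the closer the scores of two members, the closer their voting tendencies.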
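The PCA step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the article does not say which numeric values it assigned to the five outcomes, so the `ENCODING` table is hypothetical, the four members and their votes are invented toy data, and NumPy's SVD is just one common way to obtain the first principal axis.

```python
import numpy as np

# Hypothetical numeric encoding of the five outcomes; the article does not
# state the values it actually used.
ENCODING = {"Yes": 1.0, "No": -1.0, "Abstain": 0.0, "Absent": 0.0, "Present": 0.0}


def first_principal_component(votes):
    """Return (PA1, PC1) for a members x motions table of outcome strings."""
    X = np.array([[ENCODING[v] for v in row] for row in votes], dtype=float)
    X -= X.mean(axis=0)  # centre each motion (column) before PCA
    # For a centred matrix, the rows of vt are the principal axes,
    # ordered by the amount of variance they explain.
    _, _, vt = np.linalg.svd(X, full_matrices=False)
    pa1 = vt[0]    # principal axis: one weight per motion
    pc1 = X @ pa1  # projection of each member onto PA1
    return pa1, pc1


# Invented toy data: four members, four motions.
members = ["A", "B", "C", "D"]
votes = [
    ["Yes", "Yes", "No", "Yes"],
    ["Yes", "Yes", "No", "Abstain"],
    ["No", "No", "Yes", "No"],
    ["No", "Absent", "Yes", "No"],
]

pa1, pc1 = first_principal_component(votes)
# Sorting by PC1 gives the spectrum; the overall sign of PC1 is arbitrary,
# so the spectrum may come out mirrored.
spectrum = [name for _, name in sorted(zip(pc1, members))]
print(spectrum)
```

Because the sign of a principal axis is arbitrary, only relative positions on the spectrum are meaningful, not which end is labelled positive.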

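Sorting members by PC1 score from smallest to largest yields a data-driven "political spectrum" (shown in the figure below). In the middle of the spectrum is Andrew Leung Kwan-yuen (梁君彥), who never votes. The closer a member sits to Leung, the more moderate the voting style; the farther away, the more radical. When members are coloured by camp, the two sides of Leung turn out to be exactly the pro-establishment and pan-democratic camps, which matches common knowledge.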

1
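Voting tendency spectrum of the Sixth Legislative Council (2016-2018), by pro-establishment and pan-democratic camp
(Scaled proportionally to the same range as the Fifth Legislative Council spectrum)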

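In terms of relative distance, the pan-democratic camp as a whole sits closer to Andrew Leung Kwan-yuen (梁君彥), while the pro-establishment camp sits farther away. The figure shows that, overall, the pro-establishment camp votes more radically than the pan-democrats. In other words, the pro-establishment camp's pro-establishment stance is stronger than the pan-democrats' democratic stance.

At the two ends of the spectrum are Cheng Chung-tai (鄭松泰) and Lo Wai-kwok (盧偉國). This shows that their voting styles differ the most and that, compared with other members, the two are the most radical. According to our count of the voting records, the number of motions on which the two voted differently (371) is about ten times the number on which they voted the same way (37).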

Continue reading “Analysis of voting records in Hong Kong's Sixth Legislative Council (2016-2018)”

Syria’s toxic war on itself

The Middle Eastern nation of Syria has been in a state of civil war for the past seven years, with different groups trying to seize control of the country. The country has become an international battleground where various states and their proxy networks continuously clash with each other. The war has so far taken the lives of more than 465,000 people and displaced more than 12 million, of whom 6 million are refugees dispersed around the world.

About Datasets

Media documentation: the Syrian Archive dataset is an open source platform that collects, curates, verifies, and preserves visual documentation of human rights violations in Syria. It maintains an extensive video database of all known allegations in which civilians have been reported killed or injured since 2014. As of April 20, 2018, the database included 4,384 videos documented by journalists, citizen reporters and activists.

A recorded death list: the Violations Documentation Center in Syria, established in 2011, is one of the largest human rights organisations, with staff members and contacts in all governorates and most cities inside Syria.

The complex nature of the war in Syria limits access to open databases, so the extracted data could miss some important information. However, we analyze the situation in Syria as precisely as we can by filling some of the gaps with the help of the other dataset.

On the morning of April 14, 2018, the US, Britain and France bombarded three government sites in Syria, allegedly targeting chemical weapons facilities. Is it true that Syria has been suffering continuously from internal turbulence that requires intervention by foreign players?

We drew a general picture of the attacks in Syria based on a dataset of 329 records, logged from January 1, 2017 to April 20, 2018.

Part 1 General Picture

Living Hell

In the war-torn country, Aleppo, Damascus, Idlib, Hama and Daraa are the cities that both databases document as the locations where most violations took place, despite slight differences in their rankings.

图片 1.png
The locations with the most violent incidents in media coverage closely match those recorded in the actual death list.

To be specific, the Syrian Archive, which reflects media coverage, recorded the most violations in Aleppo (1,920), followed by Idlib (219), Hama (103), Damascus (97) and Homs (39).

The Violations Documentation Center in Syria, which records the actual registered death list, also showed Aleppo (7,990) as the most violation-prone city in Syria. Damascus (6,372) ranked second, followed by Idlib (4,434) and Deir Ezzor (2,904), a city absent from the media coverage database.

In terms of the geographic distribution of violent incidents in Syria, Aleppo and Idlib rank near the top in both sets of documentation and have been the most disputed regions, held by either rebels or jihadists. These are therefore the locations where the Syrian regime and its allies have concentrated their firepower.

Continue reading “Syria’s toxic war on itself”

Flying in the sky, a report of air crash worldwide

Cover.png

1/2560000 in 2016  VS.  ?  in history

Over the past 70 years, the airplane has been an important means of long-distance travel. According to the IATA annual report, the major accident rate in 2016 was 0.39 per million flights, equivalent to only one major accident in every 2.56 million flights. This seemingly safe number was paid for with countless blood and sweat. Let’s step back and count the successes and failures in the history of flight.
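The conversion between the two figures quoted from IATA can be checked with a short calculation. The sketch below assumes the 0.39 rate is expressed per one million flights, which is how IATA reports its major accident rate; the function name is only illustrative.

```python
def flights_per_accident(rate_per_million_flights):
    """Convert an accident rate per one million flights into the
    average number of flights per accident."""
    return 1_000_000 / rate_per_million_flights


flights = flights_per_accident(0.39)
print(f"one major accident per {flights / 1e6:.2f} million flights")
```

The result, roughly 2.56 million flights per major accident, is the "1/2560000" figure in the heading.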

Data source

Data volume

  • 5534

Questions

  1. How many planes crashed each year? Is there a trend? How many people were on board, how many survived, and how many died?
  2. How are accidents distributed between military and passenger flights? Any insights?
  3. Which operators and aircraft types have the most crashes? What is the relationship between operators and aircraft types?
  4. Which airline routes have the most accidents, and why?
  5. What other interesting trends or behaviors emerge as we analyze the dataset?
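As a rough illustration of how the first question could be approached, the sketch below aggregates crash records by year. The row layout (year, people aboard, fatalities) is an assumption, since the post does not list the dataset's columns, and the sample rows are invented placeholders rather than real records.

```python
from collections import defaultdict


def summarize_by_year(rows):
    """Aggregate (year, aboard, fatalities) rows into per-year totals."""
    totals = defaultdict(
        lambda: {"crashes": 0, "aboard": 0, "fatalities": 0, "survivors": 0}
    )
    for year, aboard, fatalities in rows:
        t = totals[year]
        t["crashes"] += 1
        t["aboard"] += aboard
        t["fatalities"] += fatalities
        t["survivors"] += aboard - fatalities
    # Return the years in chronological order so a trend is easy to read.
    return dict(sorted(totals.items()))


# Invented placeholder rows, not taken from the real dataset.
sample_rows = [(1972, 100, 80), (1985, 200, 10), (1972, 50, 50)]
summary = summarize_by_year(sample_rows)
for year, t in summary.items():
    print(year, t)
```

The same per-year totals can then be plotted to look for the trend the question asks about.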

History of airplane accidents

Count of accidents by Year

A1.png

From the chart, we can see that the total number of accidents rose from low to high before the 1970s. After that, there are small peaks around 1990 and 2010, but the overall trend after 1990 is gradually downward.

At the beginning of the 20th century, in 1903, the Wright brothers invented the airplane. In 1909, France held a major flying competition, which alarmed England and other European countries. Although the aircraft of the day still had many problems, militaries could not wait to use them in war. Airplanes first appeared in combat in the Italo-Turkish War. The power of the airplane attracted the militaries of other countries, which led to rapid development in the military aviation industry. During the First World War, which began in 1914, airplanes were used mainly for reconnaissance, transport and other supporting roles. By the Second World War, around 1940, airplanes were widely used in battle. Around the same time, in 1945, the International Air Transport Association (IATA) was established in Havana, the capital of Cuba.

In 1978, US President Carter signed a landmark law in the history of American aviation legislation, the Airline Deregulation Act. Under it, the establishment and merger of companies in the US domestic aviation industry, route selection, fare setting, and even loss-making operations were largely freed from government control and intervention. The number of airplanes grew quickly, and with it the likelihood of crashes. Another reason we consider is that the aircraft technology of the time had weaknesses and still needed improvement. As the technology matured, the number of crashes decreased, a pattern that is evident after 2000.

Continue reading “Flying in the sky, a report of air crash worldwide”