The objective of this competition is to use your knowledge of politics and elections and your statistical and data science skills to create the most accurate prediction you can for Japan’s 2019 Upper House Election.
Teams may create their models using any publicly available source of data including past election results, opinion polls, demographic and economic data, news reports and social media texts, or anything else they can think of – they are encouraged both to study and learn from well-established theories and methods of electoral forecasting, and to innovate using their own original ideas and hypotheses.
Forecasts must be submitted before polling day. Winners will be determined by how closely their forecasts match the actual election results and by a panel of judges' evaluation of their performance.
You need to submit three different types of files:
- two CSV files with predictions (see below),
- script/code used to create your model(s),
- slides (PDF, max 8 pages), in English or Japanese, describing data and methods.
Note: the slides have to be saved as a PDF file. In addition, the page size should be A4, and the orientation should be landscape. Animations and videos are not permitted. [added 2019-07-03]
Japan’s Upper House election consists of two tiers: the PR (proportional representation) tier and the District tier. You must submit two CSV files, one for each tier.
The PR tier: the CSV file for the PR tier should contain the predicted vote share for the seven parties specified by the event organizers (Liberal Democratic Party, Constitutional Democratic Party of Japan, Democratic Party for the People, Komeito, Japan Innovation Party, Japanese Communist Party, and Social Democratic Party).
(Note: while the party list is unlikely to change, party mergers etc. are possible; the final list will be distributed to teams when the election is officially called.) [Added 2019-05-13]
The submission file will have the following format:
party, vote_share
Party A, 30.00
Party B, 29.99
etc.
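As an illustration only, a PR-tier file in the layout above could be produced with Python's standard `csv` module. The function name `pr_csv` is hypothetical. The sketch assumes that vote shares are percentages between 0 and 100, which the example values suggest but the rules do not state. It also writes plain commas without a following space; the rules do not say whether the upload system requires the space shown in the example.

```python
import csv
import io


def pr_csv(shares):
    """Render {party name: vote share} as CSV text with a header row.

    Assumption: shares are percentages in [0, 100], written with two decimals.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["party", "vote_share"])
    for party, share in shares.items():
        if not 0 <= share <= 100:
            raise ValueError(f"vote share out of range for {party}: {share}")
        writer.writerow([party, f"{share:.2f}"])
    return buffer.getvalue()


# Placeholder names and values taken from the format example, not real forecasts.
print(pr_csv({"Party A": 30, "Party B": 29.99}))
```

Replacing the placeholder names with the party names distributed by the organizers would give one row per party.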
The District tier: the CSV file for the District tier should contain the district name, the candidate name, and the predicted outcome (1 for “win”, 0 for “lose”) for each candidate. Note that some districts elect multiple members. The submission file should have the following format:
district, candidate_J, candidate_E, outcome
Hokkaido, 石田一郎, Ishida Ichiro, 1
Hokkaido, 山田太郎, Yamada Taro, 0
Hokkaido, 佐々木藍子, Sasaki Aiko, 1
etc.
…where “candidate_J” and “candidate_E” refer to the candidate names in Japanese and English (Romaji), respectively (provided by the event organizers). The submission file should contain a predicted outcome for every registered candidate, not just for likely winners.
Note: The total number of candidates with a value of “1” in the “outcome” column (i.e. the number you forecast to win district seats) should be no more than 74 – the number of district seats up for election. The upload system will automatically reject entries which forecast an incorrect number of seats.
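Before uploading, a team may want to check its District-tier file against the constraints described above. The following is a minimal sketch, not the organizers' upload check: the function name `check_district_csv` and the exact wording of the messages are hypothetical, and the only rules it tests are the four-column header, outcome values of 0 or 1, and at most 74 predicted winners.

```python
import csv
import io

MAX_DISTRICT_SEATS = 74  # number of district seats up for election
EXPECTED_HEADER = ["district", "candidate_J", "candidate_E", "outcome"]


def check_district_csv(text):
    """Return a list of problems found in the CSV text (empty if none)."""
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    # Strip stray whitespace around fields and ignore blank lines.
    rows = [[field.strip() for field in row] for row in reader if row]
    if not rows or rows[0] != EXPECTED_HEADER:
        return ["header must be: " + ", ".join(EXPECTED_HEADER)]

    problems = []
    winners = 0
    for row_no, row in enumerate(rows[1:], start=1):
        if len(row) != 4 or row[3] not in ("0", "1"):
            problems.append(f"data row {row_no}: need 4 fields and outcome 0 or 1")
            continue
        winners += int(row[3])
    if winners > MAX_DISTRICT_SEATS:
        problems.append(f"{winners} predicted winners exceeds {MAX_DISTRICT_SEATS}")
    return problems


sample = """district, candidate_J, candidate_E, outcome
Hokkaido, 石田一郎, Ishida Ichiro, 1
Hokkaido, 山田太郎, Yamada Taro, 0
Hokkaido, 佐々木藍子, Sasaki Aiko, 1
"""
print(check_district_csv(sample))  # prints [] because the sample passes all checks
```

The check does not verify that every registered candidate appears in the file, since that requires the candidate list provided by the organizers.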
The list of parties, districts and candidates will be provided by the event organizers after the election is officially called (only in English for the parties, and both in Japanese and Romaji for the candidates).
The PDF file submitted to explain your method should consist of no more than eight (8) slides, which (a) will be printed out as a poster and placed on display on the day of the prize-giving; and (b) may be used by the team to make a short, eight-minute presentation at the prize-giving ceremony. [Added 2019-05-13]
The accuracy of a team’s submission is evaluated on how close it is to the actual election results, i.e. the number of seats correctly predicted.
Details: for the PR tier, seats are distributed based on the team’s predicted vote share using the D’Hondt method. The sum of absolute differences between the simulated seats from your prediction and the actual seats will be divided by two (since each incorrectly predicted seat results in both a false positive and a false negative). This penalty is subtracted from the highest score for the PR tier, 50 (the number of seats contested). Event organizers will simulate the number of seats based on the vote share predicted by each team.
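The PR-tier scoring described above can be sketched as follows. This is an illustration under stated assumptions, not the organizers' implementation: the function names `dhondt` and `pr_score` are hypothetical, all seats are assumed to be distributed among the parties in the submission, and ties between equal quotients are broken by the order in which parties are listed, which the rules do not specify.

```python
def dhondt(vote_shares, seats):
    """Allocate `seats` among parties with the D'Hondt highest-averages method."""
    allocation = {party: 0 for party in vote_shares}
    for _ in range(seats):
        # The next seat goes to the party with the largest quotient
        # share / (seats already won + 1). Ties go to the first party listed.
        winner = max(vote_shares, key=lambda p: vote_shares[p] / (allocation[p] + 1))
        allocation[winner] += 1
    return allocation


def pr_score(predicted_shares, actual_seats, total_seats=50):
    """Maximum score minus half the sum of absolute seat differences."""
    simulated = dhondt(predicted_shares, total_seats)
    parties = set(simulated) | set(actual_seats)
    abs_diff = sum(abs(simulated.get(p, 0) - actual_seats.get(p, 0)) for p in parties)
    return total_seats - abs_diff / 2


# Made-up example with 10 seats and three placeholder parties.
shares = {"Party A": 48.0, "Party B": 31.0, "Party C": 21.0}
print(dhondt(shares, 10))  # {'Party A': 5, 'Party B': 3, 'Party C': 2}
print(pr_score(shares, {"Party A": 4, "Party B": 4, "Party C": 2}, total_seats=10))  # 9.0
```

In the example one seat is misallocated (Party A gains a seat that Party B actually won), so the absolute differences sum to 2 and the penalty is 1.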
For the District tier, the number of elected candidates correctly predicted will be used.
The maximum points a team can achieve for accuracy, adding both the PR tier and the District tier, will be 124 (i.e. the total number of seats contested, 50 for the PR tier and 74 for the district tier). [Added 2019-05-13]
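The District-tier count and the combined accuracy total can be sketched in the same spirit. The names `district_score` and `accuracy_points` are hypothetical, and identifying a candidate by a (district, Romaji name) pair is an assumption made for this example.

```python
MAX_ACCURACY_POINTS = 124  # 50 PR seats + 74 district seats


def district_score(predictions, actual_winners):
    """Count the elected candidates that the submission marked with outcome 1.

    `predictions` maps (district, candidate_E) to 0 or 1;
    `actual_winners` is a collection of (district, candidate_E) pairs.
    """
    predicted_winners = {cand for cand, outcome in predictions.items() if outcome == 1}
    return len(predicted_winners & set(actual_winners))


def accuracy_points(pr_points, district_points):
    """Add the two tier scores and check the total against the stated maximum."""
    total = pr_points + district_points
    if total > MAX_ACCURACY_POINTS:
        raise ValueError("total exceeds the maximum of 124 points")
    return total


# Placeholder candidates from the format example; the outcome is invented.
predictions = {
    ("Hokkaido", "Ishida Ichiro"): 1,
    ("Hokkaido", "Yamada Taro"): 0,
    ("Hokkaido", "Sasaki Aiko"): 1,
}
elected = {("Hokkaido", "Ishida Ichiro"), ("Hokkaido", "Yamada Taro")}
print(district_score(predictions, elected))  # 1
```

Only one of the two elected candidates was marked as a winner, so this toy submission earns one District-tier point.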
In addition to accuracy of prediction, teams are also evaluated by a panel of judges for the quality of their model (including factors such as innovation and general applicability) and their presentation.
You may use any publicly-available data to predict election results. You must be able to submit all the data used for the prediction to the event organizers upon request without violating any relevant rules and regulations. If the data used in the analysis cannot be shared with a third party, a script or some other material should be provided upon request.
If you are using data from Twitter in your forecast, you must abide by the company’s usage regulations. Teams intending to use Twitter data should carefully read the regulations linked from the “Data” page on this site. [Added 2019-05-23]
Data and scripts may not be shared with other teams. [Added 2019-05-13]
IMPORTANT: You are NOT allowed to publicly disclose your election forecasts until the election is over, as doing so would violate the Public Offices Election Law.
Teams may consist of undergraduate or graduate students from any school of Waseda University and students from Waseda University Senior High School and Honjo Senior High School. Teams that include research associates and post-doctoral researchers may enter but are judged separately. Members of faculty may also enter the competition but are not eligible to win any prizes. Each team should contain at least 2 members and no more than 4 members.
Multiple team membership is not permitted; individuals may only enter as part of a single team. [Added 2019-05-13]
To receive a prize, a team must, in addition to submitting its files, present its method and results on the day of the award ceremony. Not all team members have to be present, but at least one member should be able to give an oral or poster presentation.
Teams are encouraged to include a range of members in order to get different ideas and perspectives on the challenges involved in the competition. A special prize may be awarded to a strongly-performing team which includes a mixture of members of different genders, nationalities, ages, or academic backgrounds, or which includes members of minority groups. [Added 2019-05-13]
- Grand Prize: ¥100,000
- SPSE Prize: ¥50,000
- ITOCHU Techno-Solutions Prize
- ADK Prize
- Data Scientist Fes 2019 Prize
- Hitachi Prize
- BrainPad Prize
- Mizuho Bank Prize
It is possible for a team to win multiple prizes. The SPSE Prize is reserved for students from the School of Political Science and Economics and is awarded to the best team that consists of SPSE students. Team(s) that receive the Grand and SPSE Prizes are not eligible for other prizes.
- June 26, 2019: deadline for team entry
- 17 days before the Election Day: Election officially called
- One day before the Election Day: deadline for file submission
- July 14 or 21, 2019: Election Day (tentative)
- July 27, 2019: presentations and award ceremony; reception
- Center for Data Science: Manabu Kobayashi, Tota Suko (Faculty of Social Sciences)
- Faculty of Political Science and Economics: Airo Hino, Atsushi Tago, Michiko Ueda
- Waseda Institute of Political Economy: Robert Fahey
Center for Data Science, email@example.com