
A data mining approach to analysis and prediction of movie ratings

Saraee, M, White, S and Eccleston, J 2004, A data mining approach to analysis and prediction of movie ratings, in: The Fifth International Conference on Data Mining, Text Mining and their Business Applications, 15-17 September 2004, Malaga, Spain.


    Abstract

    This paper details our analysis of the Internet Movie Database (IMDb), a free, user-maintained online resource of production details for over 390,000 movies, television series and video games. It contains information such as title, genre, box-office takings, cast credits and user ratings. We gather a series of interesting facts and relationships using a variety of data mining techniques. In particular, we concentrate on attributes relevant to the user ratings of movies: whether big-budget films are more popular than their low-budget counterparts, whether any relationship can be shown between movies produced during the "golden age" (e.g. Citizen Kane, It’s A Wonderful Life), and whether any particular actors or actresses are likely to help a movie succeed. The paper also reports on the techniques used, describing their implementation and usefulness. We found that the IMDb is difficult to mine because of the format of the source data. We also found some interesting facts: the budget of a film is no indication of how well rated it will be, there is a downward trend in the quality of films over time, and the director and actors/actresses involved in a film are the most important factors in its success or lack thereof. The data used in this paper is not freely distributable and remains copyright of the Internet Movie Database Inc.

    Item Type: Conference or Workshop Item (Paper)
    Themes: Media, Digital Technology and the Creative Economy
    Schools: Colleges and Schools > College of Science & Technology > School of Computing, Science and Engineering > Data Mining and Pattern Recognition Research Centre
    Journal or Publication Title: Transactions of the Wessex Institute
    Publisher: Wessex Institute Press
    Refereed: Yes
    Depositing User: Dr Mo Saraee
    Date Deposited: 03 Nov 2011 16:06
    Last Modified: 20 Aug 2013 18:17
    URI: http://usir.salford.ac.uk/id/eprint/18838
