==========================================
Bike Sharing Dataset
==========================================

Hadi Fanaee-T

Laboratory of Artificial Intelligence and Decision Support (LIAAD), University of Porto
INESC Porto, Campus da FEUP
Rua Dr. Roberto Frias, 378
4200 - 465 Porto, Portugal


=========================================
Background
=========================================

Bike sharing systems are a new generation of traditional bike rentals in which the whole process, from membership to
rental and return, has become automatic. Through these systems, a user can easily rent a bike at one location and
return it at another. Currently, there are over 500 bike-sharing programs around the world, comprising
over 500 thousand bicycles. Today, there is great interest in these systems due to their important role in traffic,
environmental and health issues.

Apart from the interesting real-world applications of bike sharing systems, the characteristics of the data generated by
these systems make them attractive for research. Unlike other transport services such as bus or subway, the duration
of travel and the departure and arrival positions are explicitly recorded in these systems. This feature turns a bike
sharing system into a virtual sensor network that can be used for sensing mobility in the city. Hence, it is expected
that most important events in the city could be detected by monitoring these data.

=========================================
Data Set
=========================================
The bike-sharing rental process is highly correlated with environmental and seasonal settings. For instance, weather
conditions, precipitation, day of week, season, hour of the day, etc. can affect rental behavior.
The core data set is the two-year historical log for the years 2011 and 2012 from the Capital Bikeshare system,
Washington D.C., USA, which is publicly available at http://capitalbikeshare.com/system-data. We aggregated the data
on both an hourly and a daily basis and then extracted and added the corresponding weather and seasonal information.
Weather information was extracted from http://www.freemeteo.com.

=========================================
Associated tasks
=========================================

- Regression:
  Prediction of the hourly or daily bike rental count based on the environmental and seasonal settings.

- Event and Anomaly Detection:
  The count of rented bikes is also correlated with some events in the town, which are easily traceable via
  search engines. For instance, a query like "2012-10-30 washington d.c." in Google returns results related to
  Hurricane Sandy. Some of the important events are identified in [1]. Therefore the data can also be used for
  validation of anomaly or event detection algorithms.


=========================================
Files
=========================================

- Readme.txt
- hour.csv : bike sharing counts aggregated on an hourly basis. Records: 17379 hours
- day.csv : bike sharing counts aggregated on a daily basis. Records: 731 days


=========================================
Dataset characteristics
=========================================
Both hour.csv and day.csv have the following fields, except hr, which is not available in day.csv

- instant : record index
- dteday : date
- season : season (1: spring, 2: summer, 3: fall, 4: winter)
- yr : year (0: 2011, 1: 2012)
- mnth : month (1 to 12)
- hr : hour (0 to 23)
- holiday : whether the day is a holiday or not (extracted from http://dchr.dc.gov/page/holiday-schedule)
- weekday : day of the week
- workingday : 1 if the day is neither a weekend nor a holiday, otherwise 0
+ weathersit :
    - 1: Clear, Few clouds, Partly cloudy
    - 2: Mist + Cloudy, Mist + Broken clouds, Mist + Few clouds, Mist
    - 3: Light Snow, Light Rain + Thunderstorm + Scattered clouds, Light Rain + Scattered clouds
    - 4: Heavy Rain + Ice Pellets + Thunderstorm + Mist, Snow + Fog
- temp : normalized temperature in Celsius. The values are divided by 41 (max)
- atemp : normalized feeling temperature in Celsius. The values are divided by 50 (max)
- hum : normalized humidity. The values are divided by 100 (max)
- windspeed : normalized wind speed. The values are divided by 67 (max)
- casual : count of casual users
- registered : count of registered users
- cnt : count of total rental bikes, including both casual and registered

=========================================
License
=========================================
Publications that use this dataset must cite the following publication:

[1] Fanaee-T, Hadi, and Gama, Joao, "Event labeling combining ensemble detectors and background knowledge",
Progress in Artificial Intelligence (2013): pp. 1-15, Springer Berlin Heidelberg, doi:10.1007/s13748-013-0040-3.

@article{
    year={2013},
    issn={2192-6352},
    journal={Progress in Artificial Intelligence},
    doi={10.1007/s13748-013-0040-3},
    title={Event labeling combining ensemble detectors and background knowledge},
    url={http://dx.doi.org/10.1007/s13748-013-0040-3},
    publisher={Springer Berlin Heidelberg},
    keywords={Event labeling; Event detection; Ensemble learning; Background knowledge},
    author={Fanaee-T, Hadi and Gama, Joao},
    pages={1-15}
}

=========================================
Contact
=========================================

For further information about this dataset please contact Hadi Fanaee-T ([email protected])
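
=========================================
Example: restoring the normalized fields
=========================================

The field descriptions under "Dataset characteristics" say that temp, atemp, hum and windspeed were divided by 41, 50,
100 and 67 respectively, and that cnt covers both casual and registered users. The Python sketch below takes that
description at face value: it multiplies the four fields back by those divisors and checks that cnt equals
casual + registered. The function names (denormalize, counts_consistent, load_rows) and the assumption that hour.csv
sits in the working directory are illustrative only and are not part of the dataset. The unit of the restored wind
speed is not stated in this Readme.

```python
import csv
from pathlib import Path

# Divisors taken from the field descriptions in this Readme.
SCALE = {"temp": 41.0, "atemp": 50.0, "hum": 100.0, "windspeed": 67.0}


def denormalize(row):
    """Return a copy of a CSV row (dict of strings) in which the four
    normalized weather fields are multiplied back by their divisors."""
    out = dict(row)
    for field, divisor in SCALE.items():
        out[field] = float(row[field]) * divisor
    return out


def counts_consistent(row):
    """Check that cnt equals casual + registered for one row."""
    return int(row["cnt"]) == int(row["casual"]) + int(row["registered"])


def load_rows(path):
    """Read hour.csv or day.csv into a list of denormalized dicts."""
    with open(path, newline="") as handle:
        return [denormalize(row) for row in csv.DictReader(handle)]


if __name__ == "__main__":
    path = Path("hour.csv")  # assumption: the file is in the working directory
    if path.exists():
        rows = load_rows(path)
        print(len(rows), "records loaded")
        print("all counts consistent:", all(counts_consistent(r) for r in rows))
```

Only the standard library is used, so the sketch runs without extra packages. The same functions work for day.csv,
since none of them touch the hr field.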