When you go looking for data, you will find that it comes in many forms and formats. OpenDataPhilly is a good place to see the variety. It helps to know a bit about the most common formats you may encounter.
XLS (Microsoft Excel spreadsheet) – Common spreadsheet software format. If you can find data in .xls format, it will save you a lot of time and effort.
CSV (comma-separated values) – Stores tabular data as plain text, with commas separating the values in each row. Google Sheets can export files as CSV, and CSV files usually open easily in Excel or Google Sheets.
HTML (Hypertext Markup Language) – The markup language used to write web pages.
XML (Extensible Markup Language) – A markup language with a heavily structured format. Commonly used to describe items in a database.
PDF (Portable Document Format) – A flat, fixed-layout format that combines text, fonts, and graphics. PDFs are convenient for sharing because the format is standard and the files are usually small and easy to open. However, the fixed layout makes it difficult to extract information.
SHP (shapefile) – A geospatial data format commonly used for geography and maps.
API (application programming interface) – A set of programming instructions and standards for accessing a web-based software application or tool. A software company releases its API to the public so that other developers can build products powered by its service.
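To make the difference between formats concrete, here is a minimal Python sketch that reads the same two records from CSV text and from XML text, using only the standard library. The sample data, the column names, and the `rows_from_csv` and `rows_from_xml` helpers are all invented for illustration; real files will be structured differently.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical sample data, invented for illustration only.
CSV_TEXT = "neighborhood,permits\nFishtown,12\nManayunk,7\n"

XML_TEXT = """<records>
  <record><neighborhood>Fishtown</neighborhood><permits>12</permits></record>
  <record><neighborhood>Manayunk</neighborhood><permits>7</permits></record>
</records>"""


def rows_from_csv(text):
    """Read CSV text into a list of dicts keyed by the header row."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def rows_from_xml(text):
    """Read each <record> element into a dict of child tag -> text."""
    root = ET.fromstring(text)
    return [
        {child.tag: child.text for child in record}
        for record in root.findall("record")
    ]


csv_rows = rows_from_csv(CSV_TEXT)
xml_rows = rows_from_xml(XML_TEXT)

# Both formats carry the same information; only the packaging differs.
print(csv_rows == xml_rows)
```

Notice that the CSV version is two short lines of data plus a header, while the XML version wraps every value in tags. That is why CSV is usually the quicker route into a spreadsheet.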
Your goal is always to find the easiest and simplest way to get the data into a spreadsheet so you can work with it. For example, if you want to look at Rowan University crime stats (which you will later in the semester), you will find they are posted as a PDF, as HTML on a web page, and as .csv files.
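When the data is only available as a table on a web page, one possible route into a spreadsheet is to convert the HTML table to CSV. The sketch below uses Python's standard `html.parser` and `csv` modules and assumes a single, simple table. The `TableParser` class, the `html_table_to_csv` function, and the sample rows are hypothetical placeholders, not real crime figures.

```python
import csv
import io
from html.parser import HTMLParser


class TableParser(HTMLParser):
    """Collect the text of every <td>/<th> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, or None
        self._cell = None  # text pieces of the cell being read, or None

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        # Keep text only while inside a cell.
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


def html_table_to_csv(html):
    """Return the table rows found in `html` as CSV text."""
    parser = TableParser()
    parser.feed(html)
    parser.close()
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.rows)
    return out.getvalue()


# Placeholder table, invented for illustration only.
SAMPLE_HTML = """
<table>
  <tr><th>Offense</th><th>Count</th></tr>
  <tr><td>Theft</td><td>4</td></tr>
  <tr><td>Vandalism</td><td>2</td></tr>
</table>
"""

print(html_table_to_csv(SAMPLE_HTML))
```

The CSV text it produces can be saved to a file and opened in Excel or Google Sheets. Pages with nested or multiple tables would need more careful handling than this sketch provides, and if a .csv download is offered, that is still the simpler choice.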