
Introduction to Data

Introduction to data management, databases and database lingo for startup and volunteer grass-roots environmental organizers and others with little to no database experience.

This article evolved from a talk for volunteer, grass-roots environmentalists with little to no data experience. It may also be useful if you have some database background and want to review your approach to organizational data.

The focus here is on everyday practices to help your group avoid the data-related gridlock that many groups experience, regardless of the software or online platform you use.

As you start to think about data, it's most important to keep your organizational goals and activities in mind. Then ask:

   1. What data is most useful for our cause? And on the flip side: What data don't we need? Spend the most time creating and maintaining data that is useful. You may invent new ways to track things most important to your cause, while passing on data that other organizations find important or that they keep out of habit.  Spend some good, quiet time to think about what data will really help your cause.

   2. What data is required and desired? By whom? Why? What reports does your organization need or want in order to provide just the right info to: people who care about your issue from all sides, members and donors, board members and volunteers, funders and potential funders, partners and prospective partners, the media including bloggers, your municipality, your state/province or region, and any federal agencies? Required data often has to do with money, for transparency, legitimacy and tax purposes. Make sure the data you spend most time on is about your own top priorities. In addition to basic fiduciary and community responsibility, what info do the people that your group cares about most want and need? How will your data help you identify new sources of support, make more reliable predictions, and tell compelling stories?

   3. How will we create, use and maintain data in ways that are accurate, efficient, effective and appropriate to our cause? The ways you use your tools matter much more than the tools themselves (software, online platforms, gadgets, communication services, etc.). Accurate, trustworthy data starts with asking the right questions and answering those questions consistently. Asking a question in database land means creating a place, or field, for data to be entered. Answering means data entry, which, if done consistently, allows you to include or exclude data for reports later. Efficient data creation and maintenance means spending just as much time as you need and no more. Time spent planning well generally means less gridlock, wasted time and frustration later. Keeping data current in one accessible place, rather than scattered inconsistently in several systems and formats, requires discipline and may seem slower at any moment, but saves a huge amount of time and frustration later. Effective use of data starts with focus on your goal and strategies to achieve that goal. Know how you expect the data that you generate and keep to help strengthen your strategies. And if all that sounds like a challenge, creating and using appropriate data may be trickiest because living in harmony with nature isn't necessarily about data at all. As you work with data, observe the origin and any cultural values that may be amplified or ignored. Imagine how data might one day tell the story of your mission accomplished.

Now that you're thinking about how data can support your organizational goals and activities, the rest of this article provides language to help you  answer these questions in ways that serve your mission well. Read on!

What is a database?

Database Lingo

More Resources

What is a database?

Databases allow organizers to create, store, find, share and interact with structured information on a larger scale than is possible by word-of-mouth, memory, paper or any other traditional medium. Similarly, spreadsheets like Microsoft Excel and Google Spreadsheets allow for creation, storage, retrieval, sharing and interaction with structured information. Spreadsheets may be enough for a startup effort or for any group run by one person. Or, with good planning, you may want to bypass spreadsheets and start with a database.

Databases are more powerful than spreadsheets or paper because they allow users to quickly create relationships (connections) between records.  Databases also allow for more complex reports (filtered lists) and targeted interactions with groups of constituents and decision-makers.  Therefore, using a database well requires more planning than using spreadsheets.

Even as these powerful tools allow for organization, analysis and interaction at an astonishing scale, the reports and communication they can enable are only as good as your decisions about what data is important, the quality of data you enter, and your choices about who to communicate with when. No database is complete without foresight, maintenance and regular review.

Database Lingo

Common English words like "record" that can mean anything from a world record to a music record to an archival record have very specific meanings for databases:

A backup is a copy of your data kept in a safe place that you only use in case of emergency to replace lost data. Decide how often you want to back up. Weekly or monthly is a good start for grass-roots and smaller groups, depending on how much data entry you do. Use a reminder: anything from writing it in your calendar for the upcoming year to setting regular alerts on your primary gadget. If you don't schedule regular backups, make it a habit to back up every time you do major data entry. For example, consider your events over only after attendee data is entered and backed up.

Cleanup (data review) is like cleaning your house or yard. It's not always fun to start but can be relaxing and may help clear your mind. Steps: (1) Save a backup. (2) Delete earlier backups. (3) Sort your data, field by field (column by column), to find and correct misspellings, extra spaces and other oddities. Be careful to keep records (rows) intact when sorting -- make sure to sort the entire table (or worksheet) rather than just one column. (4) Find and merge duplicates manually or by using a deduplication tool. (5) Save and back up.
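
As a rough illustration of steps (3) and (4), here is a minimal Python sketch that trims extra spaces and flags possible duplicates. The field names (first_name, last_name, email) and the sample records are assumptions invented for this example, not part of any particular system, and a person should still review each flagged group before merging anything.

```python
from collections import defaultdict

# Hypothetical sample records; the field names are assumptions for illustration.
records = [
    {"first_name": "Ana ", "last_name": "Lee", "email": "ana@example.org"},
    {"first_name": "Ana", "last_name": " Lee", "email": "ANA@example.org"},
    {"first_name": "Sam", "last_name": "Ortiz", "email": "sam@example.org"},
]

def clean(record):
    """Return a copy of the record with spaces trimmed from both ends of every field."""
    return {field: value.strip() for field, value in record.items()}

def find_possible_duplicates(rows):
    """Group rows whose name and email match once upper/lower case is ignored."""
    groups = defaultdict(list)
    for row in rows:
        key = (row["first_name"].lower(), row["last_name"].lower(), row["email"].lower())
        groups[key].append(row)
    # Only groups with more than one row are possible duplicates.
    return [group for group in groups.values() if len(group) > 1]

cleaned = [clean(record) for record in records]
duplicates = find_possible_duplicates(cleaned)
print(len(duplicates))  # 1 group: the two "Ana Lee" records
```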

CRM is a trendy, generic acronym for the kind of databases that typical organizations use to track people. It means "Constituent Relationship Management" system. This name highlights your constituents as people who relate to each other and to your issue, rather than focusing on the system.

.csv files are plain-text spreadsheets in comma-separated values format, similar to Microsoft Excel spreadsheets, which have the file name extension .xls or .xlsx instead. .csv files are generally simpler and often better for data management tasks like migration from one database to another. You can save an Excel file as a .csv file by choosing File, then Save As, and selecting .csv as the file type. A .csv file includes only one worksheet and no formatting. Also see Formatting below.
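
To show how simple the format is, here is a small Python sketch that reads .csv text using the standard csv module. The column names and the two sample rows are invented for illustration; with a real file you would open the file instead of using the in-memory text shown here.

```python
import csv
import io

# Invented sample data: the first line holds the field labels,
# and each following line is one record.
csv_text = (
    "First Name,Last Name,Home City\n"
    "Ana,Lee,Fairview\n"
    "Sam,Ortiz,Gresham\n"
)

# DictReader uses the first line as field labels for every record.
reader = csv.DictReader(io.StringIO(csv_text))
rows = list(reader)

print(reader.fieldnames)      # ['First Name', 'Last Name', 'Home City']
print(rows[0]["Home City"])   # Fairview
```

Every value comes back as plain text, which is one reason .csv files travel well between systems: there are no colors, fonts, or extra worksheets to lose along the way.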

Data is a collection of meaningful characters entered and stored in database fields to describe people, organizations, things, activities, concepts, etc. Data can be retrieved from the database, changed and reported on. Extra credit: Data is plural. Datum is singular.

Data entry is the process of typing or speaking the information you want to save into your database.  Misspellings, typos, inconsistencies, and creating duplicate records are the most common data entry errors.

Duplicates ("Dups") are two or more records that describe the same thing, often causing errors like miscounts and use of out-of-date contact information. Often, the newer record includes more recent, more accurate data than the older record. However, the older record may include more data or be related to other records in ways you want to preserve. Deduplication is a cleanup process to combine (merge) duplicate records without losing the data you want. Avoid creating duplicate records by searching for an existing record before creating a new one. Clean up your data regularly to find and merge duplicates. Duplicates are one of the most common data problems.

Fields are places to enter limited amounts of data into the database and to hold it there in place so the data can be found again. Database fields generally correspond, for example, to the boxes on paper forms, the columns in spreadsheets, and the boxes you fill in on a web form. A field usually has a label and a place to type in (enter) data. Example: The field labeled "First Name" is usually a rectangle, in which you can type a person's actual first name. Field labels identify what the data in each field is about.

Formatting, for example using colors or font style on a spreadsheet to indicate differences between records (rows) and/or fields (columns), can be ok in some situations where the spreadsheet is used only by one person who keeps a current key (description) to describe what the formatting means so they remember later. However, formatting can't be reported on or preserved when data is migrated to another system, so its information value is easily lost. Using additional fields is often more useful instead. For example, rather than highlighting rows in green for people who plan to attend an event, and in red for those who can't, create a new column called RSVP. Then consistently enter "attending", "not attending", or "not sure yet" in that column to record each person's plans. Using fields rather than highlighting makes it much easier to sort and send specific messages to the right constituents.

Integration between an organization's different information systems, say its web site and database, is more complicated than it seems to users. As organizers, we get the joy and responsibility of learning how information systems integrate with each other behind the scenes.

Mapping is when you describe the paths between two different information systems. The data from a field in an old system should follow the map, or path, when being migrated to another system. Mapping is how to make sure that first names from the old system end up in the first name field of the new system, last names from the old system end up in the last name field of the new system, etc. Data from every field that you want to keep from an old system must have a place to go in the new system. Like many database words, mapping can mean different things in different contexts. Process mapping is something else; see below.
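
A field map can be as simple as a two-column list. The Python sketch below is one hypothetical way to write such a map down and apply it; the old field names (fname, lname, zip, nickname) and the new ones are assumptions made up for this example. It also reports any old field that has no place to go, so you can decide whether to add a field to the new system or drop the data on purpose.

```python
# Hypothetical map: old system field name -> new system field name.
FIELD_MAP = {
    "fname": "First Name",
    "lname": "Last Name",
    "zip": "Home Postal Code",
}

def migrate_record(old_record, field_map):
    """Return the record relabeled for the new system, plus any unmapped old fields."""
    new_record = {}
    unmapped = []
    for old_field, value in old_record.items():
        if old_field in field_map:
            new_record[field_map[old_field]] = value
        else:
            # This field has no destination yet; flag it instead of losing it silently.
            unmapped.append(old_field)
    return new_record, unmapped

old = {"fname": "Ana", "lname": "Lee", "zip": "97024", "nickname": "Annie"}
new, unmapped = migrate_record(old, FIELD_MAP)
print(new)       # {'First Name': 'Ana', 'Last Name': 'Lee', 'Home Postal Code': '97024'}
print(unmapped)  # ['nickname']
```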

Merging records means combining all the good (correct) data from duplicate records into one record and deleting the other(s).
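
Here is a minimal Python sketch of one possible merge rule, assuming the newer record is usually more accurate but the older record may hold data the newer one lacks. The rule, the field names, and the sample values are assumptions for illustration; real merges often need a person to judge which value is correct.

```python
def merge_records(older, newer):
    """Combine two duplicate records, preferring the newer value unless it is blank."""
    merged = dict(older)  # start with everything the older record knows
    for field, value in newer.items():
        if value.strip():  # skip blank values so they don't erase older data
            merged[field] = value
    return merged

# Invented sample duplicates: the newer record has a current email but no phone.
older = {"name": "Ana Lee", "email": "ana@old.example.org", "phone": "555-0100"}
newer = {"name": "Ana Lee", "email": "ana@example.org", "phone": ""}

merged = merge_records(older, newer)
print(merged)
# {'name': 'Ana Lee', 'email': 'ana@example.org', 'phone': '555-0100'}
```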

Migrating data is when you transfer it from one information system to another. Data migration often requires careful analysis and many tries until you get it right. Migration is easier, faster, and more accurate with well-kept data.

A process map is a graphic description of an organizational process. An organizational process is a routine set of tasks or activities. Mapping out organizational processes in flow chart format helps identify inefficiencies that can be streamlined. Process mapping is an important step in database design that helps ensure that your database includes all fields and relationships between records that are needed to accomplish your work. Software like Microsoft Visio is often used in larger organizations and by consultants, but is time-consuming to use. For small organizations, sketching processes out on paper can be a fine start to help clarify your repeated organizational activities.

A record is a collection of data about one person, organization, activity, concept, etc. In general, a record is the equivalent of a row of data in a spreadsheet, or of an entire paper form.

Reports are filtered sub-sets of your data. A report is the result after you filter your data to show only those records that meet criteria you set. A good report includes only the records you want, and all of the records you want, showing only the fields you want for those records. Example: Mailing labels report with the First Name, Last Name, Home Street, Home City, Home State/Province, and Home Postal Code fields for all records of people who live in the city of Fairview. A report with more specific criteria would include those same fields for the records of people living in Fairview, Oregon, but not in Fairview Park, Ohio.
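
The mailing-label example can be sketched in a few lines of Python. The people listed are invented sample data, and build_report is a hypothetical helper written for this illustration, not a feature of any particular database; the field labels are the ones used in the example above.

```python
# Invented sample records using the field labels from the example above.
people = [
    {"First Name": "Ana", "Last Name": "Lee", "Home City": "Fairview",
     "Home State/Province": "Oregon", "Email": "ana@example.org"},
    {"First Name": "Sam", "Last Name": "Ortiz", "Home City": "Fairview Park",
     "Home State/Province": "Ohio", "Email": "sam@example.org"},
    {"First Name": "Kim", "Last Name": "Park", "Home City": "Fairview",
     "Home State/Province": "Texas", "Email": "kim@example.org"},
]

LABEL_FIELDS = ["First Name", "Last Name", "Home City", "Home State/Province"]

def build_report(records, criteria, fields):
    """Keep records that meet every criterion, showing only the chosen fields."""
    return [
        {field: record[field] for field in fields}
        for record in records
        if all(record[name] == wanted for name, wanted in criteria.items())
    ]

# Broad criteria: everyone whose city is exactly "Fairview", in any state.
broad = build_report(people, {"Home City": "Fairview"}, LABEL_FIELDS)

# More specific criteria: Fairview, Oregon only.
specific = build_report(
    people,
    {"Home City": "Fairview", "Home State/Province": "Oregon"},
    LABEL_FIELDS,
)

print(len(broad), len(specific))  # 2 1
```

Note that the Email field is stored on every record but left out of both reports, because it was not among the fields requested.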

A repository is another word for a place, or container, for data, like information system, database and CRM. CRM is a more common word for systems that track people and organizations. Repository tends to be used more for systems that store and track documents, like library systems.

Spaces are characters just like visible text; we just can't see them. Spaces sometimes behave differently than the characters we can see, and some systems ignore them in some situations. If an extra space gets entered during data entry, it can affect your ability to search or sort. Example: You search for last name Lee and can't find it, but you know the record exists. Try *Lee to see if a space got entered before the L. (See Wildcards below to learn about using * in searches.)
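
The short Python sketch below shows, with invented sample names, how a single leading space can hide a record from an exact search and push it to the top of a sorted list. Other systems may treat spaces differently, so take this as an illustration of the problem rather than a description of your database.

```python
# Invented sample data: the second "Lee" was typed with a space in front.
last_names = ["Lee", " Lee", "Adams"]

# An exact search finds only the record without the stray space.
exact_matches = [name for name in last_names if name == "Lee"]
print(exact_matches)    # ['Lee']

# A space sorts before any letter, so " Lee" jumps ahead of "Adams".
sorted_names = sorted(last_names)
print(sorted_names)     # [' Lee', 'Adams', 'Lee']

# Trimming spaces before comparing finds both records.
trimmed_matches = [name for name in last_names if name.strip() == "Lee"]
print(trimmed_matches)  # ['Lee', ' Lee']
```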

String is a geeky word for a series of characters, like this sentence.

Structured information is more specific and therefore more retrievable and reportable than free-form information. For example, a free-form note about a contact person's record might say, "only wants to be e-mailed," while the note on another contact person's record might say, "don't send paper mail." A more structured way to track and later use this info about people's communication preferences is to create a Yes/No checkbox field to check for people who do not want paper mail. Then in reports to create mailing lists, anyone with this checkbox field checked can be excluded. Structured information allows for this filtering, while free-form information is more useful for truly unique data that does not need to be reported on.

System administrators set up (customize, implement) an information system. They also troubleshoot and help keep the system in sync with real-life organizational processes.

Tables keep data from running wild like alphabet soup. Tables are like scaffolding that holds up (creates a framework for) database records and fields. Basically, a database table corresponds to one worksheet in an Excel spreadsheet when each column in the spreadsheet is used for one type of data all the way down. The power of databases is their ability to relate (link, or connect) tables to each other to expand the amount and kind of information that can be kept and reported on with regard to any record.
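
To make the idea of related tables concrete, here is a minimal sketch using SQLite, a small database that ships with Python. The table names (people, donations), the field names, and the sample amounts are all assumptions invented for this example. Each donation record stores the ID of the person who gave it, and that shared ID is the relationship that lets one report draw on both tables.

```python
import sqlite3

# A temporary in-memory database, so nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables: one for people, one for their donations.
conn.execute(
    "CREATE TABLE people (person_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute(
    "CREATE TABLE donations ("
    "donation_id INTEGER PRIMARY KEY, "
    "person_id INTEGER REFERENCES people(person_id), "
    "amount REAL)"
)

conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?)",
    [(1, "Ana", "Lee"), (2, "Sam", "Ortiz")],
)
conn.executemany(
    "INSERT INTO donations VALUES (?, ?, ?)",
    [(1, 1, 25.0), (2, 1, 40.0), (3, 2, 10.0)],
)

# Relate the tables through person_id to total each person's giving.
totals = conn.execute(
    "SELECT p.first_name, p.last_name, SUM(d.amount) "
    "FROM people p JOIN donations d ON d.person_id = p.person_id "
    "GROUP BY p.person_id ORDER BY p.person_id"
).fetchall()
conn.close()

print(totals)  # [('Ana', 'Lee', 65.0), ('Sam', 'Ortiz', 10.0)]
```

In a single spreadsheet, Ana's name would have to be retyped on every donation row, inviting typos and duplicates. With related tables, her name is stored once.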

Users are the people who use an information system. Good system administrators like to keep users happy.

A query is a question you ask a database, in the database's own language, in order to view just some of your data. The query acts as a filter, so that your report includes only the records you want and only the fields you want to see for those records.
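
Many databases use a language called SQL for queries. The sketch below, again using the SQLite database that ships with Python, asks for the names of everyone who has not opted out of paper mail. The table, the field names (including the no_paper_mail checkbox, stored as 1 for checked and 0 for unchecked), and the sample people are assumptions for illustration only; your own system may hide its query language behind a search or report screen.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # temporary in-memory database
conn.execute(
    "CREATE TABLE people (first_name TEXT, last_name TEXT, email TEXT, no_paper_mail INTEGER)"
)
conn.executemany(
    "INSERT INTO people VALUES (?, ?, ?, ?)",
    [
        ("Ana", "Lee", "ana@example.org", 1),    # checked: do not send paper mail
        ("Sam", "Ortiz", "sam@example.org", 0),
        ("Kim", "Park", "kim@example.org", 0),
    ],
)

# The query names the fields to show (SELECT) and the records to keep (WHERE).
mailing_list = conn.execute(
    "SELECT first_name, last_name FROM people "
    "WHERE no_paper_mail = ? ORDER BY last_name",
    (0,),
).fetchall()
conn.close()

print(mailing_list)  # [('Sam', 'Ortiz'), ('Kim', 'Park')]
```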

A wildcard character acts as a placeholder when you don't know the actual characters in a string of text. Different information systems may designate different characters as their wildcard, like in card games when "Sevens are wild" or "Jokers are wild." Use a wildcard character in at least two situations: (1) to search for data you're not sure how to spell and (2) to find existing records that may have typos (data entry errors). The trick with wildcards is to be just specific enough to find what you need without your results returning too many records to be useful. Example: If asterisk (*) is the wildcard for your system and you want to find a record for someone whose first name is Michele, search for Miche*. Your search results will include any existing record that was entered as Michele or Michelle. If you are less specific and only use Mich* to search, your results will include all Michaels, too, which may be too many records to find the one you want at a glance.
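
The Michele example can be tried out in Python, whose standard fnmatch module happens to use asterisk (*) as a wildcard. The list of names is invented sample data, and your own system may use a different wildcard character or matching rules.

```python
from fnmatch import fnmatchcase

# Invented sample first names.
first_names = ["Michele", "Michelle", "Michael", "Micah", "Mitchell"]

# Specific enough: matches only names that start with "Miche".
specific = [name for name in first_names if fnmatchcase(name, "Miche*")]
print(specific)  # ['Michele', 'Michelle']

# Less specific: "Mich*" also brings in every Michael.
broad = [name for name in first_names if fnmatchcase(name, "Mich*")]
print(broad)     # ['Michele', 'Michelle', 'Michael']
```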

More Resources

Our friends at other organizations also contribute to the wealth of online help and training for non-profit database users. Check out these excellent resources:

Idealware Reports & Articles - Constituent Databases

Progressive Technology Project - Database Resources

Tech Soup Learning Center - Databases

 
