MySQL for Python : Simple Insertion - A command-line insertion utility (part 1) - The necessary modules

6/14/2013 7:37:34 PM

1. Essentials: close and commit

In programs that interface with multiple databases or otherwise persist beyond the database connection that you have initiated, you will find a need to use a couple of MySQL commands that we have not yet discussed: close and commit.

In need of some closure

When one is finished with a database, it is good practice to close the cursor. This ensures the cursor is not used again to refer to that database connection and also frees up resources. To close a cursor in MySQL for Python, simply call the close() method of your cursor object:

cur.close()

What happened to commit?

If you are experienced with using the MySQL shell or perhaps programming interfaces with MySQL using different APIs, you may wonder what has happened to the commit call that one normally would make at the end of every transaction to render changes permanent.

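MySQL for Python follows the Python DB-API, and since version 1.2.0 it opens every connection with autocommit switched off. On non-transactional tables such as MyISAM this makes no difference, because each change takes effect as soon as the statement runs. On transactional tables such as InnoDB, however, changes that have not been committed are discarded when the connection closes. MySQL for Python therefore supports a commit() method, which is also what you need if you are programming to several databases and want to ensure that work on one is finished before another is opened. You simply call it on the database connection object: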

mydb.commit()

After committing the changes to the database, one typically closes the database connection. To do this, use the database object's close() method:

mydb.close()

Why are these essentials non-essential?

Unless you are running several database threads at a time, working with transactional tables, or dealing with similar complexity, a short script like ours will usually run without an explicit commit() or close(). On non-transactional tables such as MyISAM, every change is written as soon as its statement executes, so nothing is left to commit. On transactional tables such as InnoDB, MySQL for Python leaves autocommit off by default, and you must call commit() before closing or the changes are rolled back.

Similarly, when the program terminates, Python tends to close the cursor and database connection as it destroys both objects.
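To see why the order of commit() and close() matters on a driver that leaves autocommit off, here is a minimal sketch. It uses sqlite3 from the standard library as a stand-in, since it follows the same DB-API calls (connect, cursor, execute, commit, close) and needs no server. It is not MySQLdb, and the table name people and both helper functions are made up for illustration.

```python
import os
import sqlite3
import tempfile


def insert_name(path, name, commit):
    """Insert one row, optionally commit, then close cursor and connection."""
    conn = sqlite3.connect(path)
    cur = conn.cursor()
    cur.execute("CREATE TABLE IF NOT EXISTS people (name TEXT)")
    conn.commit()  # make sure the table itself is saved
    cur.execute("INSERT INTO people (name) VALUES (?)", (name,))
    if commit:
        conn.commit()  # without this, the INSERT is discarded on close
    cur.close()
    conn.close()


def count_rows(path):
    """Open a fresh connection and count the rows that were really saved."""
    conn = sqlite3.connect(path)
    cur = conn.cursor()
    cur.execute("SELECT COUNT(*) FROM people")
    total = cur.fetchone()[0]
    cur.close()
    conn.close()
    return total


# Use a throwaway database file so the example leaves nothing behind.
handle, db_path = tempfile.mkstemp(suffix=".db")
os.close(handle)
insert_name(db_path, "committed", commit=True)
insert_name(db_path, "not committed", commit=False)
rows_saved = count_rows(db_path)
os.remove(db_path)
print(rows_saved)
```

Only the row that was committed before the connection closed survives. The same ordering (close the cursor, commit, close the connection) is the safe one to use with a MySQLdb connection.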

2. A command-line insertion utility

For this project we want to create a program with the following functionality:

Runs from the command-line
Uses a flag system allowing for the -h flag for help
Allows the user to define the database being used
Allows the user to designate which user and password combination to use
Allows the user to ask for the tables available in a given database
Provides the user with the column structure of the table on demand
Validates user input for the given table of the selected database
Builds the database INSERT statement on-the-fly
Inserts the user input into the chosen table of the selected database

2.1 The necessary modules

Before we jump into coding, let us first assess which modules we need to import. The modules we need are listed next to our required functionality as follows. The need for MySQLdb is understood.

Flag system: optparse
Login details: getpass
Build the INSERT statement: string

Our import statement thus looks like this:

import getpass, MySQLdb, optparse, string

In addition to these, we will also use the PrettyTable module to provide the user with the column structure of the table in a neat format. This module is not part of the standard library, but is easily installed using the following invocation from the command-line:

easy_install prettytable

If you prefer not to install PrettyTable, you will obviously want to modify the code according to your preferences when we get to printing out the database definition table to the user.

The main() thing

In this project, we will have several functions. In order to ensure that we can call those functions from other programs, we will code this project using a main function. The other functions will be inserted before the main() function in the program, but starting with the main() function and coding the others as needed helps us to keep from losing the plot of the program. So let's define main():

def main():

In any size of program, using a main() function is good practice and results in a high degree of readability. Ideally, main() should be among the smallest of the functions in a program. The point is that main() should be the brains of the program that coordinates the activity of the classes and functions.
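One common way to keep the functions importable from other programs is to guard the call to main(). The sketch below is an assumption about how the finished file could end, not the article's own listing; the return value is only a placeholder so that there is something to print.

```python
def main():
    """Coordinate the program; the real work lives in the helper functions."""
    # Placeholder body: the actual steps are developed in the rest of the article.
    return "main() ran"


# Run main() only when the file is executed directly, not when it is imported.
if __name__ == "__main__":
    print(main())
```

With this guard in place, another program can import the file and call its functions without triggering the command-line behaviour.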

From here, the flow of the main() function will follow this logic:

1. Set up the flag system.
2. Test the values passed by the user.
3. Try to establish a database connection.
4. If successful, show the user the tables of the designated database.
5. Offer to show the table structure, and then do so.
6. Accept user input for the INSERT statement, column-by-column.
7. Build the INSERT statement from the user input and execute it.
8. Print the INSERT statement to the user for feedback.
9. Commit changes and close all connections.

Coding the flag system

We need to tell Python which flags should be supported and to which variables the values should be assigned. The code looks like this:

opt = optparse.OptionParser()
opt.add_option("-d", "--database", action="store", type="string", dest="database")
opt.add_option("-p", "--passwd", action="store", type="string", dest="passwd")
opt.add_option("-u", "--user", action="store", type="string", dest="user")
opt, args = opt.parse_args()

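Each call to add_option() registers a short flag, a long flag, and the attribute name (dest) under which the parsed value is stored. For simplicity's sake, we then pass the values to simpler variable names: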

database = opt.database
passwd = opt.passwd
user = opt.user

If you have trouble with the program after you code it, here is a point for blackboxing. Simply insert the following loop to show what the computer is thinking at this point:

for i in (database, passwd, user): print "'%s'" %(i)

Blackboxing is jargon in the IT industry and simply means isolating the parts of a problem so that each piece can be tested separately from the others. With this for loop, we can ensure that Python has properly assimilated the flagged input from the user.
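The flag handling can also be blackboxed without touching the command line, because parse_args() accepts an explicit list of arguments. The following is a small sketch along those lines; the wrapper name parse_flags and the sample values are hypothetical.

```python
import optparse


def parse_flags(argv):
    """Parse a list of command-line arguments and return the three values."""
    parser = optparse.OptionParser()
    parser.add_option("-d", "--database", action="store", type="string", dest="database")
    parser.add_option("-p", "--passwd", action="store", type="string", dest="passwd")
    parser.add_option("-u", "--user", action="store", type="string", dest="user")
    options, args = parser.parse_args(argv)
    return options.database, options.passwd, options.user


# A flag that is not supplied comes back as None.
database, passwd, user = parse_flags(["-d", "test", "-u", "skipper"])
for value in (database, passwd, user):
    print("'%s'" % (value,))
```

Here passwd is None because no -p flag was given, which is exactly the case the next step has to handle.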

Testing the values passed by the user

Next, we need to ensure that the user has not passed us empty or no data. If the user has, we need to ask for a new value.

while (user == "") or (user == None):
    print "This system is secured against anonymous logins."
    user = getpass.getuser()
while (passwd == "") or (passwd == None):
    print "You must have a valid password to log into the database."
    passwd = getpass.getpass()
while (database == "") or (database == None):
    database = raw_input("We need the name of an existing database to proceed. Please enter it here: ")

Note that we are not using if. If we had, we would have needed to set up a loop around it to check the value repeatedly. Using while saves us the trouble, because it keeps asking until the user supplies a value.
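Since the three while loops share one shape, they could be folded into a single helper. This is only a sketch of that idea: the name require_value is hypothetical, and the prompt is passed in as a function so that the loop can be exercised without a keyboard.

```python
def require_value(value, ask):
    """Return value if it is non-empty; otherwise call ask() until it is."""
    while not value:  # catches both None and the empty string
        value = ask()
    return value


# Simulate a user who first presses Enter and then types a password.
replies = iter(["", "s3cret"])
passwd = require_value(None, lambda: next(replies))
print(passwd)
```

In the utility itself, the call would look like passwd = require_value(passwd, getpass.getpass), with getpass.getpass doing the prompting.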

Try to establish a database connection

Having checked the login data, we can now attempt a connection. Just because the user data has checked out does not mean that the data is valid. It merely means that it fits with our expectations. The data is not valid until the database connection is made. Until then, there is a chance of failure. We therefore should use a try...except... structure.

try:
    mydb = MySQLdb.connect(host = 'localhost',
                           user = user,
                           passwd = passwd,
                           db = database)
    cur = mydb.cursor()
    quit = 1    # 1 signals a successful connection
except MySQLdb.Error:
    print "The login credentials you entered are not valid for the database you indicated. Please check your login details and try again."
    quit = 0    # 0 signals that the connection failed

Here we use quit as a token to indicate the success of the connection. One could just as easily use connected or is_connected. A successful connection is not made until a cursor object is created.

Within the except clause, it is important to tell the user why the program is going to terminate. Otherwise, he or she is left in the dark, and the program can effectively become useless to them.
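An alternative to a separate success token is to let a helper return the connection and cursor, or None for both on failure. The sketch below assumes that design; open_connection, FakeConnection, and fake_connect are all hypothetical, and the fakes exist only so that the example runs without a MySQL server.

```python
def open_connection(connect, error_type, **params):
    """Return (connection, cursor), or (None, None) if connecting fails."""
    try:
        mydb = connect(**params)
        cur = mydb.cursor()
    except error_type as err:
        # Tell the user why the program cannot continue.
        print("Could not connect: %s" % (err,))
        return None, None
    return mydb, cur


class FakeConnection(object):
    """Stand-in for a real database connection object."""
    def cursor(self):
        return "cursor"


def fake_connect(**params):
    """Stand-in for a driver's connect(); rejects any password but 'secret'."""
    if params.get("passwd") != "secret":
        raise ValueError("access denied")
    return FakeConnection()


good = open_connection(fake_connect, ValueError, user="skipper", passwd="secret")
bad = open_connection(fake_connect, ValueError, user="skipper", passwd="wrong")
print(bad == (None, None))
```

With the real driver, the call would be open_connection(MySQLdb.connect, MySQLdb.Error, host='localhost', user=user, passwd=passwd, db=database), and the rest of the program would proceed only if the returned connection is not None.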

Showing the tables

Next, we cull out the tables from the database and show them to the user. We only do this if a successful connection has been made.

if quit == 1:
    get_tables_statement = """SHOW TABLES"""
    cur.execute(get_tables_statement)
    tables = cur.fetchall()
    print "The tables available for database %s follow below:" %(database)
    for i in xrange(0, len(tables)):
        # each row is a one-item tuple, so print its first element
        print "%s. %s" %(i+1, tables[i][0])
    table_choice = raw_input("Please enter the number of the table into which you would like to insert data. ")

For the sake of formatting, we increment the number of the table by one in order to use the natural number system when presenting the options to the user.

Upon receiving the number of table_choice from the user, we must validate it. To do so, we stringify the number and pass it to a function valid_table(), which we will create later in the development process. For now, it is enough to know that the function needs the user's choice and the number of tables in the designated database. For simplicity, we pass the list of tables.

table_choice = str(table_choice)
table_no = valid_table(table_choice, tables)

Once the number chosen is validated, we must decrement it by one to match the zero-based indexing that Python uses for sequences.

table = tables[table_no-1][0]
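The article's own valid_table() comes later and may well differ. Purely to illustrate the kind of check involved, here is a hypothetical helper, check_table_choice, that returns the chosen number when it is usable and None otherwise.

```python
def check_table_choice(choice, tables):
    """Return the 1-based table number as an int, or None if it is invalid."""
    try:
        number = int(choice)
    except ValueError:
        return None  # not a number at all
    if 1 <= number <= len(tables):
        return number
    return None  # a number, but outside the list shown to the user


# Rows as returned by fetchall() after SHOW TABLES: one-item tuples.
tables = (("menu",), ("orders",))
table_no = check_table_choice("2", tables)
print(tables[table_no - 1][0])
```

A caller could keep prompting the user until the function returns something other than None.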

Showing the table structure, if desired

The next step is to show the user the data structure of the table, if desired. We effect this with a raw_input statement and an if... clause:

show_def = raw_input("Would you like to see the database structure of the table '%s'? (y/n) " %(table))

Before launching into the if... statement, we can economize on our code. Regardless of whether the user wants to see the table format, we will need the column headers later to effect the insertion. We can take care of retrieving them now so that the information is available both inside and outside the if... statement, all for the price of one MySQL statement.

def_statement = """DESCRIBE %s""" %(table)
cur.execute(def_statement)
definition = cur.fetchall()

If the user answers y at the show_def prompt, then we run the following if block:

if show_def == "y":
    from prettytable import PrettyTable
    tabledef = PrettyTable()
    # newer PrettyTable releases use: tabledef.field_names = [...]
    tabledef.set_field_names(["Field", "Type", "Null", "Key", "Default", "Extra"])
    for j in xrange(0, len(definition)):
        tabledef.add_row([definition[j][0], definition[j][1], definition[j][2], definition[j][3], definition[j][4], definition[j][5]])
    # newer PrettyTable releases use: print tabledef
    tabledef.printt()

As mentioned when discussing the modules for this project, here we import PrettyTable from the module prettytable. This merely allows us to output a nicely formatted table similar to MySQL's own. It is not required for the program to work as long as you convey the value of the six tabular fields for each row.

Note that, if show_def equals anything other than a simple y, the if block will not execute.
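For readers who prefer not to install PrettyTable, the six fields can be lined up with plain string padding. The sketch below shows one possible approach; format_definition and the sample rows are hypothetical, with the rows shaped like the tuples that DESCRIBE returns.

```python
def format_definition(definition):
    """Return the table definition as aligned plain text, one row per line."""
    headers = ("Field", "Type", "Null", "Key", "Default", "Extra")
    rows = [headers] + [tuple(str(cell) for cell in row[:6]) for row in definition]
    # Each column is as wide as its longest entry, header included.
    widths = [max(len(row[i]) for row in rows) for i in range(len(headers))]
    lines = []
    for row in rows:
        padded = [cell.ljust(width) for cell, width in zip(row, widths)]
        lines.append(" | ".join(padded).rstrip())
    return "\n".join(lines)


sample = [("id", "int(11)", "NO", "PRI", None, "auto_increment"),
          ("name", "varchar(30)", "YES", "", None, "")]
print(format_definition(sample))
```

In the utility, the result of cur.fetchall() stored in definition would take the place of sample.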

Accepting user input for the INSERT statement

We next need to ask the user for the values to be inserted. To guide the user, we will prompt them for the value of each column in turn:

print "Please enter the data you would like to insert into table %s" %(table)
columns = []
values = []
for j in xrange(0, len(definition)):
    column = definition[j][0]
    value = raw_input("Value to insert for column '%s'?" %(definition[j][0]))
    columns.append(str(column))
    values.append('"' + str(value) + '"')
columns = ','.join(columns)
values = ','.join(values)
print columns
print values

The lists columns and values obviously correspond to the respective parts of the MySQL INSERT statement. It is important to remember that the column headers in a MySQL statement are not set in quotes, but that the values are. Therefore, we must format the two lists differently. In either case, however, the items need to be separated by commas. This will make it easier when building the next INSERT statement.

If you encounter difficulty in coding this project, this is another good point for blackboxing. Simply print the value of each list after the close of the for loop to see the value of each at this point.

Building the INSERT statement from the user input and executing it

Having the user's values for insertion, we are now at a point where we can build the MySQL statement. We do this with string formatting characters.

statement = """INSERT INTO %s(%s) VALUES(%s)""" %(table, columns, values)

We then execute the statement. For extra security against malformed data, you could couch this in a try...except... structure.

cur.execute(statement)

If the execute statement was processed without a problem, it is a good idea to give the user some feedback. An appropriate output here would be the statement that was processed.

print "Data has been inserted using the following statement: \n", statement
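Quoting the values by hand breaks as soon as a value contains a double quote, and it leaves the statement open to SQL injection. A common alternative is to put %s placeholders in the statement and hand the raw values to execute() as a second argument, so that the driver does the quoting. The following is a sketch of that variation, not the article's method; build_insert and the table menu are hypothetical.

```python
def build_insert(table, columns):
    """Build an INSERT statement with one %s placeholder per column."""
    placeholders = ", ".join(["%s"] * len(columns))
    # Backticks quote the identifiers, which come from SHOW TABLES and
    # DESCRIBE rather than from free-form user input.
    column_list = ", ".join("`%s`" % name for name in columns)
    return "INSERT INTO `%s` (%s) VALUES (%s)" % (table, column_list, placeholders)


statement = build_insert("menu", ["name", "price"])
print(statement)
# With a live MySQLdb cursor, the unquoted user values go in separately:
# cur.execute(statement, ["Lobster", "19.99"])
```

With this approach, the columns and values lists would stay as lists rather than being joined into strings, and the values would be collected without the surrounding quotation marks.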

Committing changes and closing the connection

Finally, we can commit the changes and close the connection. On non-transactional tables the commit is not strictly necessary for a single-run database program such as this, but on transactional tables it is required, so it is a good habit to maintain.

cur.close()
mydb.commit()
mydb.close()

It is worth noting that closing does not imply committing: on transactional tables, changes that were never committed are rolled back when the connection closes, so commit() must come first. commit() also allows us to commit changes to the database without closing the database connection, so we can commit changes at regular intervals.
