Sixth researcher

Another Bioinformatics website


When students outperform their teacher – Part 1

This post is a tribute to the work of my students in the subject “Basic Python Programming for Scientists”. They did a great job in their final projects, and I want to show you some examples here that could be useful for many Python beginners. Remember that you can download the course slides from the Didactic Materials section.

First of all, I want to say that this was my first experience as a Python programming teacher and I was pleasantly surprised by my students. I have learned a lot from them and I can say that some of them outperformed the teacher. I also want to thank Medhat Helmy and Piotr Bentkowski for helping me with the teaching. Unfortunately, I cannot say the same about Adam Mickiewicz University, which didn’t pay me any money for teaching. Welcome to Poland…

Today I’ll show you 2 projects that exceeded all my expectations: Image Quiz by Błażej Szymański and Temperature converter by Michal Stachowiak. In the coming days I’ll publish more incredible projects.


Image Quiz

Click to play Image Quiz online


Maybe the project that impressed me the most was ‘Image Quiz’ by Błażej Szymański. At the beginning he planned to write a Python script that automatically downloads images of different airplane models from Google, and then to use these images with an HTML+JavaScript interface to play a quiz in which you guess the airplane model shown in a randomly selected image. I suggested some minor changes, like using a more biological topic such as bird species instead of airplanes. Now you can check the final result yourself by playing the Image Quiz online!

The project basically consists of 4 Python scripts:

  • fetcher.py: reads a file ‘list.txt’ containing the names of the birds to look for and uses the Google API to retrieve the bird photo URLs, which are stored in JSON format in ‘data.txt’.
  • saver.py: retrieves the image files using the URLs stored in ‘data.txt’ and saves them in the ‘pictures’ folder.
  • resizer.py: checks whether the images are taller than 800px and, if so, resizes them.
  • final.py: saves a dictionary with bird names as keys and arrays with the image locations as values in JSON format into ‘result.txt’. This JSON file will be used later by a JavaScript function to prepare the quiz.
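In the original project the quiz itself is prepared by a JavaScript function that is not shown here. Just to illustrate how the dictionary stored in ‘result.txt’ can be turned into a question, here is a minimal Python sketch. The function name make_question, its parameters and the example data are my own assumptions, not part of Błażej’s code.

```python
import random

def make_question(images_by_bird, n_options=4, rng=random):
    """Pick one image and a shuffled list of candidate answers.

    images_by_bird mirrors the layout of 'result.txt':
    bird name -> list of image paths.
    """
    # only birds that still have at least one image can be asked about
    candidates = [bird for bird, paths in images_by_bird.items() if paths]
    answer = rng.choice(candidates)
    image = rng.choice(images_by_bird[answer])
    # wrong answers are drawn from the remaining bird names
    others = [bird for bird in images_by_bird if bird != answer]
    options = rng.sample(others, min(n_options - 1, len(others))) + [answer]
    rng.shuffle(options)
    return {"image": image, "options": options, "answer": answer}

# hypothetical data in the same shape as 'result.txt'
example = {
    "Parus major": ["./pictures/Parus major/0.jpeg", "./pictures/Parus major/1.jpeg"],
    "Pica pica": ["./pictures/Pica pica/0.jpeg"],
    "Ciconia ciconia": ["./pictures/Ciconia ciconia/0.jpeg"],
}
question = make_question(example, n_options=3, rng=random.Random(0))
print(question["image"], question["options"])
```

Passing a seeded random.Random instance makes the selection reproducible, which is handy when testing.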

fetcher.py:

import requests
import json
import os

result = []
question = []
with open("list.txt","r") as file:
    for line in file:
        name = line.strip() #strip() also copes with a last line that has no newline
        if name: #skip empty lines
            question.append(name)

for q in question:
    stuff = {"key":"WRITE HERE YOUR GOOGLE API KEY",
             "num":10,
             "searchType":"image",
             "q":q,
             "cx":"003535310068073691339:ebzdlw5y6gq",
             "excludeTerms":"stock" #to get rid of stock images
             }
    j = requests.get("https://www.googleapis.com/customsearch/v1", params=stuff)
    r = json.loads(j.text)
    x = []
    for each in r["items"]:
        x.append(each["link"])
    result.append(x)
        
dictionary = dict(zip(question,result))

with open('data.txt', 'w') as resultfile:
    json.dump(dictionary, resultfile)
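One detail to keep in mind: fetcher.py reads r["items"] directly, so it stops with a KeyError if a response has no such key. Assuming that a search without results simply omits the "items" list, a more tolerant extraction could look like this sketch. The helper extract_links and the trimmed example responses are hypothetical.

```python
def extract_links(response):
    """Return the image URLs found in one decoded search response."""
    # .get() avoids a KeyError when the 'items' list is absent
    return [item["link"] for item in response.get("items", []) if "link" in item]

# hypothetical, heavily trimmed responses
with_hits = {"items": [{"link": "http://example.org/a.jpg"},
                       {"link": "http://example.org/b.jpg"}]}
without_hits = {"searchInformation": {"totalResults": "0"}}

print(extract_links(with_hits))
print(extract_links(without_hits))
```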

saver.py:

import os
import requests
import shutil
import json

with open("data.txt","r") as data:
    dictionary = json.load(data)

for key in dictionary.keys():
    
    for index, each in enumerate(dictionary[key]):
        r = requests.get(each, stream=True)
        folder = "./pictures/{pic}".format(pic=key)
        os.makedirs(folder, exist_ok=True)
        print("Retrieving image {} from {}".format(index+1,key))
        

        with open(folder + "/" + str(index) + ".jpeg", "wb") as file:
            shutil.copyfileobj(r.raw, file)

print("\nPlease review downloaded pictures and delete invalid ones.")

resizer.py:

import os
from PIL import Image

for thing in os.listdir("./pictures"):
    for model in os.listdir("./pictures/" + thing):
        if model == "Thumbs.db": #to avoid windows thumbnails
            continue
        im = Image.open("./pictures/" + thing + "/" + model)
        
        if im.size[1] > 800:
            
            ratio = im.size[0]/im.size[1]
            x = ((round(ratio*800)),800)
            print(model,"resized to",x)
            #LANCZOS is the filter that ANTIALIAS used to be an alias of (ANTIALIAS was removed in Pillow 10)
            im = im.resize(x,Image.LANCZOS)
            im.save("./pictures/" + thing + "/" + model)
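Note that resizer.py only looks at the image height (im.size[1]), so a very wide but short picture is left untouched. If you prefer to cap the longer side, whichever it is, the new size could be computed as in the following sketch. The helper target_size is hypothetical and does not use PIL, it only does the arithmetic.

```python
def target_size(width, height, max_side=800):
    """Return (width, height) scaled so that neither side exceeds max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return (width, height)  # already small enough, leave untouched
    scale = max_side / longest
    # never let a side collapse to 0 pixels for extreme aspect ratios
    return (max(1, round(width * scale)), max(1, round(height * scale)))

print(target_size(1600, 1200))
print(target_size(600, 1200))
print(target_size(640, 480))
```

The returned tuple can be passed to im.resize() in the same way as the variable x in the script.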

final.py:

import os
import json

keys = []
temp = []
values = []

for thing in os.listdir("./pictures"):
    keys.append(thing)

    for file in os.listdir("./pictures"+"/"+thing):
        if file != "Thumbs.db":
            temp.append("./pictures"+"/"+thing+"/"+file)
    values.append(temp)
    temp=[]
            
dictionary = dict(zip(keys,values))

with open("result.txt", "w") as result:
    json.dump(dictionary, result)

Temperature converter

‘Temperature converter’ by Michal Stachowiak is a script that converts temperature values between different scales and keeps the results in memory.

He implemented temperature conversions between Kelvin, Celsius and Fahrenheit scales together with a database to store and modify the results. Here I’ll show a reduced version of the script that only converts from Celsius to Kelvin.

import sys #needed for sys.exit() when the user closes the program

#################################
#FUNCTIONS FOR CONVERTING

def C_on_K (C):
    K = C + 273.15
    return K

########################################
#MAIN PROGRAM

print("\n\n")
print("       ===========================")
print("          Temperature Converter")
print("       ===========================\n\n")

print("Input and output are stored in a database\n\n")

gearbox = 0
index = 0 
while gearbox < 100: #You can execute this program 100 times
    print("Choose the converter:")
    print("----------------------")
    print(" 1. C => K")
    print("----------------------")
    print("Choose the action:")
    print("----------------------")
    print(" 7. Show all records")
    print(" 8. Show specific record")
    print(" 9. Edit specific record")
    print("10. Delete all records")
    print("11. Delete specific record")
    print("12. Close the program")
    print("----------------------")
    
    print("\nChoice 1-12:")
    answer = input().strip() #input() already returns a string
        
######################################################   
 
    if answer == "1":
        controler = 1
        while controler == 1:
            print ("Provide the temperature in Celsius degree:")
            C = float(input())
            if C < -273.15: #there is no C temp below absolute zero, so ask again for proper temperature
                print ("There is no temperature in C degree below -273.15")
                print ("Please provide proper C temperature")
                controler = 1
                
            else:        
                equation5 = C_on_K(C)
                print("=========================")
                print( "{:.2f}C degree is {:.2f}K".format(C,equation5))
                print("=========================")
                controler = 0

        gearbox += 1
        print("Press any key to continue...")
        input()    

    elif answer == "12":
        print("Thank You for using Temperature Converter")
        print("If You enjoy the program, I appreciate any donations")
        print("Press any key to quit...")
        input()
        sys.exit(0)

    else:
        print("Please provide number from 1 to 12")
        gearbox += 1
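The complete project also handled Fahrenheit, which is left out of the reduced script above. As a rough sketch of how all the conversions between the three scales could be organized (this is my own layout, not Michal’s code, and every function name here is hypothetical), each value can be routed through Celsius:

```python
ABSOLUTE_ZERO_C = -273.15

def celsius_to_kelvin(c):
    return c + 273.15

def kelvin_to_celsius(k):
    return k - 273.15

def celsius_to_fahrenheit(c):
    return c * 9 / 5 + 32

def fahrenheit_to_celsius(f):
    return (f - 32) * 5 / 9

def convert(value, source, target):
    """Convert between 'C', 'K' and 'F' by going through Celsius."""
    to_celsius = {"C": lambda v: v, "K": kelvin_to_celsius, "F": fahrenheit_to_celsius}
    from_celsius = {"C": lambda v: v, "K": celsius_to_kelvin, "F": celsius_to_fahrenheit}
    celsius = to_celsius[source](value)
    # reject physically impossible temperatures, whatever the input scale
    if celsius < ABSOLUTE_ZERO_C:
        raise ValueError("temperature below absolute zero")
    return from_celsius[target](celsius)

print("{:.2f}".format(convert(100, "C", "F")))
print("{:.2f}".format(convert(32, "F", "K")))
```

Going through one reference scale means that only four formulas are needed instead of one for every pair of scales.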


Python 3 for scientists course and other didactic materials at SixthResearcher

I want to share with you the new Didactic Materials section of the Sixth Researcher website.

In this section I will include courses, presentations, workshops and other materials that I have prepared and that could be useful for other researchers.

At the moment you can download 10 lessons on the fundamentals of Python 3 for biologists and other scientists that I taught at UAM.

A Metabarcoding workshop is also available, covering the fundamentals of the technique and a practical example that explains the bioinformatics analysis of the data.

The didactic materials in this new section will be licensed as Creative Commons Attribution-NonCommercial.


Python for Scientists

Materials from the course Basic Python 3 Programming for Scientists taught at Adam Mickiewicz University:


Metabarcoding Bioinformatics analysis

Materials from the workshop Introduction to Bioinformatics analysis of Metabarcoding data:


 

Counting blue and white bacteria colonies with Python and OpenCV

Last week I was teaching my Polish students how to use the Python packages NumPy, SciPy and Matplotlib for scientific computing. Most of the examples were about numerical calculations, data visualization and the generation of graphs and figures. If you are interested in the topic, check the following links with nice tutorials and examples for NumPy, SciPy and Matplotlib.

At the end of the lesson we also explored the capabilities of the scipy.ndimage module for image processing as shown in this nice tutorial. After all, images are pixel matrices that may be represented as NumPy arrays.
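To make the idea of “images are arrays” concrete, here is a tiny sketch with a made-up 2x3 color image. The pixel values are arbitrary and only serve as an illustration.

```python
import numpy as np

# a hypothetical 2x3 color image: rows x columns x 3 channels, 8 bits per channel
image = np.zeros((2, 3, 3), dtype=np.uint8)
image[0, 0] = [255, 0, 0]    # top-left pixel: first channel at full intensity
image[1, 2] = [0, 255, 0]    # bottom-right pixel: second channel at full intensity

print(image.shape)           # rows, columns, channels
print(image[0, 0])           # the three channel values of one pixel

# the per-pixel mean over the channels gives a crude grayscale version
gray = image.mean(axis=2)
print(gray.shape)
```

Any operation that works on arrays (slicing, arithmetic, comparisons) can therefore be applied directly to a picture.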

After the lesson, my curiosity led me to OpenCV (Open Source Computer Vision Library), an open-source library for computer vision that includes several hundred algorithms.

It is highly recommended to install the latest OpenCV version, although that means compiling the code yourself as explained here. To use OpenCV in Python, just install its wrapper with pip (pip install opencv-python) and import it in any script with import cv2. This way you can call any OpenCV algorithm as if it were native Python, while in the background it runs as C/C++ code, which makes image processing much faster.

After the technical introduction, let’s go to the interesting stuff…

Figure 1. Original blue and white bacteria colonies in Petri dish.

Let’s imagine that we are working in the lab trying to optimize a new cloning protocol. We have dozens of Petri dishes with transformed bacteria and we want to check and quantify the transformation efficiency. Each Petri dish will contain blue and white bacteria colonies; the white ones are the bacteria carrying our insert, which disrupts the lacZ gene that produces the blue color.

We want to take photos of the Petri dishes, transfer them to the computer and use a Python script to count the number of blue and white colonies automatically.

For example, we will analyze one image (Figure 1: ‘colonies01.png’) running the following command:

> python bacteria_counter.py -i colonies01.png -o colonies01.out.png
   30 blue colonies
   17 white colonies

Figure 2. Blue and white bacteria colonies are marked in red and green colors respectively as recognized by the Python script.

It will print the number of colonies of each type (blue and white) and it will create an output image (Figure 2: ‘colonies01.out.png’) where blue colonies are marked in red and white ones in green.
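Once we have both counts, a simple summary is the fraction of colonies that are white. This is only a rough sketch of one possible calculation, the helper white_fraction is hypothetical and the script itself does not compute it.

```python
def white_fraction(blue, white):
    """Fraction of counted colonies that are white."""
    total = blue + white
    if total == 0:
        raise ValueError("no colonies counted")
    return white / total

# counts printed by the script for 'colonies01.png'
fraction = white_fraction(blue=30, white=17)
print("{:.1%} of the colonies are white".format(fraction))
```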

Before showing the code, I’ll make a few remarks:

  • The code works quite well but it is not perfect: it fails to recognize 2 small white colonies and it also merges 2 other small colonies with 2 bigger ones of the same color. In the end, the script counts 30 blue and 17 white colonies.
  • One of the trickiest parts of the code is setting the color boundaries used to recognize blue and white spots. These thresholds were adjusted manually (with Photoshop’s help) before the analysis and they may need to change under different camera or illumination conditions.
  • White colonies are more difficult to identify because their colors are grayish, and similar colors appear at the edges of the blue colonies and in the background. For that reason, the image colors are inverted before analyzing the white colonies, which makes them easier to recognize.
  • It’s not AI (Artificial Intelligence). I’d rather call it ‘Human Intelligence’ because the thresholds of the recognition algorithm are trained manually. If you are interested in AI and image recognition, I suggest reading the recent article in Nature about skin cancer classification with deep neural networks.
  • I wanted to show a scientific application of image processing, but many other examples are available on the internet: recognizing Messi’s face in a photo, classifying Game Boy cartridges by color.
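To see what the color boundaries and the inversion do, the thresholding step can be imitated with plain NumPy. This is only a sketch: cv2.inRange returns a mask of 0 and 255 values while the hypothetical helper in_range_mask below returns booleans, and the three example pixels are made up.

```python
import numpy as np

def in_range_mask(image, lower, upper):
    """Boolean mask of pixels whose three channels all lie within [lower, upper]."""
    return np.all((image >= lower) & (image <= upper), axis=-1)

# hypothetical 1x3 image in BGR order: a bluish pixel, a light gray pixel, a dark pixel
image = np.array([[[150, 120, 40], [200, 200, 200], [10, 10, 10]]], dtype=np.uint8)

# blue colonies: threshold the original colors
blue_lower, blue_upper = np.array([60, 100, 20]), np.array([170, 180, 150])
print(in_range_mask(image, blue_lower, blue_upper))

# white colonies: invert the colors first, then threshold the inverted image
inverted = 255 - image
white_lower, white_upper = np.array([50, 50, 40]), np.array([100, 120, 80])
print(in_range_mask(inverted, white_lower, white_upper))
```

With these example pixels only the bluish one passes the blue boundaries, and only the light gray one passes the white boundaries after inversion.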

And here is the commented code that performs the magic:

# -*- coding: utf-8 -*-
"""
Bacteria counter

    Counts blue and white bacteria on a Petri dish

    python bacteria_counter.py -i [imagefile] -o [imagefile]

@author: Alvaro Sebastian (www.sixthresearcher.com)
"""

# import the necessary packages
import numpy as np
import argparse
import imutils
import cv2
 
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-i", "--image", required=True,
    help="path to the input image")
ap.add_argument("-o", "--output", required=True,
    help="path to the output image")
args = vars(ap.parse_args())
 
# dict to count colonies
counter = {}

# load the image
image_orig = cv2.imread(args["image"])
height_orig, width_orig = image_orig.shape[:2]

# output image with contours
image_contours = image_orig.copy()

# DETECTING BLUE AND WHITE COLONIES
colors = ['blue', 'white']
for color in colors:

    # copy of original image
    image_to_process = image_orig.copy()

    # initializes counter
    counter[color] = 0

    # define NumPy arrays of color boundaries (BGR vectors, the channel order used by OpenCV)
    if color == 'blue':
        lower = np.array([ 60, 100,  20])
        upper = np.array([170, 180, 150])
    elif color == 'white':
        # invert image colors
        image_to_process = (255-image_to_process)
        lower = np.array([ 50,  50,  40])
        upper = np.array([100, 120,  80])

    # find the colors within the specified boundaries
    image_mask = cv2.inRange(image_to_process, lower, upper)
    # apply the mask
    image_res = cv2.bitwise_and(image_to_process, image_to_process, mask = image_mask)

    # convert the masked image to grayscale and blur it slightly
    image_gray = cv2.cvtColor(image_res, cv2.COLOR_BGR2GRAY)
    image_gray = cv2.GaussianBlur(image_gray, (5, 5), 0)

    # perform edge detection, then perform a dilation + erosion to close gaps in between object edges
    image_edged = cv2.Canny(image_gray, 50, 100)
    image_edged = cv2.dilate(image_edged, None, iterations=1)
    image_edged = cv2.erode(image_edged, None, iterations=1)

    # find contours in the edge map
    cnts = cv2.findContours(image_edged.copy(), cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    # grab_contours picks the contour list whatever tuple layout this OpenCV version returns
    cnts = imutils.grab_contours(cnts)

    # loop over the contours individually
    for c in cnts:
        
        # if the contour is not sufficiently large, ignore it
        if cv2.contourArea(c) < 5:
            continue
        
        # compute the Convex Hull of the contour
        hull = cv2.convexHull(c)
        if color == 'blue':
            # prints contours in red color
            cv2.drawContours(image_contours,[hull],0,(0,0,255),1)
        elif color == 'white':
            # prints contours in green color
            cv2.drawContours(image_contours,[hull],0,(0,255,0),1)

        counter[color] += 1
        #cv2.putText(image_contours, "{:.0f}".format(cv2.contourArea(c)), (int(hull[0][0][0]), int(hull[0][0][1])), cv2.FONT_HERSHEY_SIMPLEX, 0.65, (255, 255, 255), 2)

    # Print the number of colonies of each color
    print("{} {} colonies".format(counter[color],color))

# Writes the output image
cv2.imwrite(args["output"],image_contours)

© 2018 Sixth researcher
