I recently started taking a lot of photos, with the idea that I would post them to my new photo blog. This has worked out well so far, except for the tedious process of stripping out the uploaded image URLs and placing them into my template by hand.
After a handful of day-long marathon copy-paste sessions, I've rescued myself from this heavily manual process using Python. Now I only need to paste a couple of links into a data file, and the script builds the rest.
'''
Language: Python
Run this script in the same folder as a source file "links.txt".
Odd lines contain the picture's href link
Even lines contain the picture's image src
Similar to:
http://picasaweb.google.com/000000000000000000000/AlbumName
http://lh6.googleusercontent.com/-XXXXXXXXXXX/XXXXXXXXXXX/XXXXXXXXXXX/XXXXXXXXXXX/XXXX/YYYY_MM_DD-XX_SIZE.JPG
'''
import re
filename = "links.txt"
# Read every line of the links file; the file is closed automatically
with open(filename, "r") as f:
    flines = f.readlines()
count = 0
blogText = ""
blogPost1 = '<div class="separator '
blogPost2 = '" style="clear: both; text-align: center;">\n'
blogPost2 = blogPost2 + '<a href="'
blogPost3 = '" imageanchor="1" style="margin-left:1em; margin-right:1em"><img border="0" height="300" width="450" src="'
blogPost4 = '" /></a></div>\n'
blogPost4 = blogPost4 + '<h2 class="'
blogPost5 = '">Photo title</h2>'
sectionBreak = "<!-- More -->"
# Regular Expressions to pattern match the file name
# ex. 2011_06_13-B.jpg or 2011-07-23-A3.jpg
# The dot before JPG is escaped so it only matches a literal "."
findB = r"[0-9]{4}[\-_][0-9]{2}[\-_][0-9]{2}[\-_]B[a-zA-Z0-9_\-]{1,6}\.JPG"
findNotA = r"[0-9]{4}[\-_][0-9]{2}[\-_][0-9]{2}[\-_][B-Zb-z][a-zA-Z0-9_\-]{1,6}\.JPG"
headingCSSClass = "objective"
imageCSSClass = "topPic"
imageHREF = ""
imageSRC = ""
breakText = ""
for line in flines:
    # Skip blank lines so that split()[0] cannot fail on them
    if not line.strip():
        continue
    count = count + 1
    line = line.replace("https://", "http://")
    line = line.split()[0]
    if count % 2 == 1:
        # Odd lines hold the link target
        imageHREF = line
    else:  # count % 2 == 0:
        # Even lines hold the image source and complete the pair
        imageSRC = line
        # if the line ends with a "B" image, prefix it with a section break
        matchB = re.search(findB, line)
        if matchB:
            breakText = sectionBreak
        matchNotA = re.search(findNotA, imageSRC)
        if matchNotA:
            print("Not A: " + imageSRC)
            headingCSSClass = "runnerUp"
            imageCSSClass = "secondPic"
        blogText = blogText + breakText + "\n\n" + blogPost1 + imageCSSClass
        blogText = blogText + blogPost2 + imageHREF
        blogText = blogText + blogPost3 + imageSRC
        blogText = blogText + blogPost4 + headingCSSClass + blogPost5
        # resets for the next run.
        headingCSSClass = "objective"
        imageCSSClass = "topPic"
        imageHREF = ""
        imageSRC = ""
        breakText = ""
print("BlogText: " + blogText)
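To make the two patterns easier to follow, here is a small sketch that runs them against a few file names. The patterns are copied from the script (with the dot before JPG escaped); the `classify` helper and the sample file names are made up for illustration and are not part of the original script.

```python
import re

# Patterns copied from the script, with the dot before "JPG" escaped.
FIND_B = r"[0-9]{4}[\-_][0-9]{2}[\-_][0-9]{2}[\-_]B[a-zA-Z0-9_\-]{1,6}\.JPG"
FIND_NOT_A = r"[0-9]{4}[\-_][0-9]{2}[\-_][0-9]{2}[\-_][B-Zb-z][a-zA-Z0-9_\-]{1,6}\.JPG"


def classify(image_src):
    """Hypothetical helper: return (needs_break, image_css_class) for one image URL."""
    needs_break = re.search(FIND_B, image_src) is not None
    is_runner_up = re.search(FIND_NOT_A, image_src) is not None
    return needs_break, ("secondPic" if is_runner_up else "topPic")


# Made-up file names in the YYYY_MM_DD-XX_SIZE.JPG shape.
for name in ["2011_06_13-A1_450.JPG", "2011_06_13-B1_450.JPG", "2011_06_13-C1_450.JPG"]:
    print(name, classify(name))
```

Only the "B" image gets a section break, and every image after "A" gets the runner-up styling. Note that the patterns are case-sensitive and expect at least one character between the letter and the extension, so a name like 2011_06_13-B.jpg would not match as written.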
The result? In the time it used to take me to prep a single post's links, I had prepped and named four separate posts. This has freed up a lot more time to work on the content of the posts rather than processing them.
Analysing the code
I built this quickly, with the intent of making it useful as soon as possible. As a result, there are a few improvements that can be made. For example:
- Take the links data file name as an input
- Write the output to a file, with a manual override for output file name
- Assuming all images are named consistently (they are), the script could re-order pairs of links automatically, so the source file can be unordered but still yield an ordered post. This would require some rebuilding, but would be more versatile.
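A rough sketch of how these three improvements might look, not a finished rewrite. It assumes each href line is still directly followed by its src line (only the order of the pairs is free), and that file names sort correctly as plain strings. The function names, the `-o` flag and the default output name `post.html` are all hypothetical.

```python
import argparse


def parse_args(argv=None):
    """Read the input and output file names from the command line."""
    parser = argparse.ArgumentParser(description="Build a blog post from link pairs.")
    parser.add_argument("links", nargs="?", default="links.txt",
                        help="file with alternating href / src lines")
    parser.add_argument("-o", "--output", default="post.html",
                        help="where to write the result (hypothetical default)")
    return parser.parse_args(argv)


def read_pairs(lines):
    """Group the non-empty lines into (href, src) pairs."""
    urls = [line.split()[0] for line in lines if line.strip()]
    return list(zip(urls[0::2], urls[1::2]))


def order_pairs(pairs):
    """Sort pairs by the file name at the end of the src URL."""
    return sorted(pairs, key=lambda pair: pair[1].rsplit("/", 1)[-1])


def main(argv=None):
    args = parse_args(argv)
    with open(args.links) as f:
        pairs = order_pairs(read_pairs(f.readlines()))
    with open(args.output, "w") as out:
        # The HTML-building step from the script would slot in here.
        for href, src in pairs:
            out.write(href + "\n" + src + "\n")


# Small in-memory demonstration with made-up URLs.
demo = [
    "http://example.com/album\n",
    "http://example.com/img/2011_06_13-B1_450.JPG\n",
    "http://example.com/album\n",
    "http://example.com/img/2011_06_13-A1_450.JPG\n",
]
for href, src in order_pairs(read_pairs(demo)):
    print(src)
```

Calling `main()` with no arguments would fall back to links.txt, which keeps the current behaviour as the default.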
All in all, the code does its job.
Resizing Images with ImageMagick
A second improvement to my posting workflow is a program called ImageMagick, which lets me resize and rename my original photos from the command line. While I don't yet have a script to automate this part, not having to open and resize each photo individually will save me a lot of time. Again, it will increase the time I can spend doing creative things instead of copy-paste drudgery.
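As a first idea for automating this step, here is a minimal sketch that builds ImageMagick `convert` commands from Python. It assumes ImageMagick is installed and on the PATH (newer versions also provide a `magick` command), and it borrows the 450x300 size from the blog template above. The function names and the sample file names are hypothetical.

```python
import subprocess
from pathlib import Path


def build_resize_command(source, dest, width=450, height=300):
    """Build the ImageMagick command that resizes one photo.

    A plain WIDTHxHEIGHT geometry fits the image inside that box
    while keeping its aspect ratio.
    """
    return ["convert", str(source), "-resize", f"{width}x{height}", str(dest)]


def resize_all(src_dir, dest_dir, dry_run=True):
    """Build (and optionally run) a resize command for every .JPG in src_dir."""
    dest_dir = Path(dest_dir)
    commands = []
    for photo in sorted(Path(src_dir).glob("*.JPG")):
        cmd = build_resize_command(photo, dest_dir / photo.name)
        commands.append(cmd)
        if not dry_run:
            # dest_dir must already exist; check=True raises if convert fails
            subprocess.run(cmd, check=True)
    return commands


# Made-up file names: resize one original and give it its blog name.
print(build_resize_command("IMG_0001.JPG", "2011_06_13-A1_450.JPG"))
```

Leaving `dry_run=True` only returns the commands, which makes it easy to check them before anything is written to disk.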