Concatenating Web of Science's Exported Records

If you use the Sci2 Tool with Web of Science (as described in this helpful guide), you will run into the export limit of 500 search results per text file. To get around it, I’ve written this script, which requires Python 3 and concatenates the exported files into one:

import glob, os

# Number of header and footer lines to drop from each exported file
NUM_LINES_TO_SKIP_FIRST = 2
NUM_LINES_TO_SKIP_LAST = 2

dirname = 'inputs'
os.chdir(dirname)

# glob returns files in arbitrary order, so sort for a reproducible result
filenames = sorted(glob.glob("*.txt"))
for filename in filenames:
    print(filename)

print('Overwriting output.txt...')
with open('../output.txt', 'w', encoding="utf-8") as outfile:
    for index, filename in enumerate(filenames):
        is_first = index == 0
        is_last = index == len(filenames) - 1
        with open(filename, encoding="utf-8") as infile:
            data = infile.read().splitlines(True)
        if is_first:
            print(filename, 'is first in list. Keeping header lines...')
        if is_last:
            print(filename, 'is last in list. Keeping footer lines...')
        # Keep the header only in the first file and the footer only in the last.
        # A single input file is both first and last, so it is copied unchanged.
        start = 0 if is_first else NUM_LINES_TO_SKIP_FIRST
        end = len(data) if is_last else len(data) - NUM_LINES_TO_SKIP_LAST
        outfile.writelines(data[start:end])

To use it, put all your text files from Web of Science (usually named savedrecs.txt, savedrecs(1).txt, etc.) into a folder named “inputs” in the same directory as the script, then run the script from that directory. It creates output.txt, the concatenated file, next to the “inputs” folder.
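
After running the script, it can be worth checking that no records were lost. The following is a minimal sketch, not part of the original script: the helper names count_records and check_concatenation are made up for illustration, and it assumes that every record in the exported files begins with a line starting with "PT ". Open one of your own files to confirm that before relying on the numbers.

```python
from pathlib import Path


def count_records(path, record_prefix="PT "):
    """Count lines that start with record_prefix.

    Assumption: each record in the export opens with a line beginning "PT ".
    Check one of your own files and change record_prefix if yours differ.
    """
    with open(path, encoding="utf-8") as handle:
        return sum(1 for line in handle if line.startswith(record_prefix))


def check_concatenation(inputs_dir="inputs", output_path="output.txt"):
    """Return (records found in the inputs, records found in the output)."""
    input_files = sorted(Path(inputs_dir).glob("*.txt"))
    expected = sum(count_records(path) for path in input_files)
    actual = count_records(output_path)
    return expected, actual
```

Calling print(check_concatenation()) from the directory that contains “inputs” and output.txt prints two numbers. If they differ, the header or footer line counts at the top of the script probably do not match your export format.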

Written on September 4, 2019