Python: how to grab a certain number of lines after a match
Let's say I have an input text file of the following format:
    Section1 Heading    Number of lines: n1
    Line 1
    Line 2
    ...
    Line n1
    Maybe some irrelevant lines
    Section2 Heading    Number of lines: n2
    Line 1
    Line 2
    ...
    Line n2
where certain sections of the file start with a header line that specifies how many lines are in that section. Each section heading has a different name.
I have written a regular expression that matches the header line for whichever section name the user searches for, parses it, and returns the number (n1, n2, etc.) of lines in that section. I have been trying to use a for-in loop to read each line until a counter reaches n1, but it hasn't worked so far.
Here's my question: how do I return just a certain number of lines following a matched line when that number is given in the match and different for each section? I'm new to programming, and I appreciate any help.
EDIT: Okay, here's the relevant code that I have so far:
    import re
    print
    fname = raw_input("Enter filename: ")
    toolname = raw_input("Enter toolname: ")

    def findcounter(fname, toolname):
        logfile = open(fname, "r")

        pat = 'SUCCESS Number of lines :'  #headers all have that format
        for line in logfile:
            if toolname in line:
                if pat in line:
                    s=line

        pattern = re.compile(r"""(?P<name>.*?)          #starting name
                             \s*SUCCESS                 #whitespace and success
                             \s*Number\s*of\s*lines     #whitespace and strings
                             \s*\:\s*(?P<n1>.*)""",re.VERBOSE)
        match = pattern.match(s)
        name = match.group("name")
        n1 = int(match.group("n1"))
        #after matching line, I attempt to loop through the next n1 lines
        lcount = 0
        for line in logfile:
            if line == match:
                while lcount <= n1:
                    match.append(line)
                    lcount += 1
        return result
The file itself is pretty long, and there are lots of irrelevant lines interspersed between the sections I'm interested in. What I'm not too sure about is how to specify printing the lines directly after a matched line.
    # f is a file object
    # n1 is how many lines to read
    lines = [f.readline() for i in range(n1)]
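Applied to the header format from the question, the same idea might look like the Python 3 sketch below. It assumes headers of the form `<name> SUCCESS Number of lines : <n>`, as in the question's code; the function name `read_section` and the sample log are hypothetical. `itertools.islice` pulls the next `n` lines from the same iterator the loop is consuming, so reading resumes right after the header.

```python
import io
import re
from itertools import islice

# Assumed header layout, taken from the question's code:
#   "<name> SUCCESS Number of lines : <n>"
HEADER = re.compile(
    r"(?P<name>.*?)\s*SUCCESS\s*Number\s*of\s*lines\s*:\s*(?P<n>\d+)"
)


def read_section(lines, toolname):
    """Return the n lines following the first header that mentions toolname."""
    it = iter(lines)  # one shared iterator, so islice resumes after the header
    for line in it:
        match = HEADER.match(line)
        if match and toolname in match.group("name"):
            n = int(match.group("n"))
            return [l.rstrip("\n") for l in islice(it, n)]
    return []  # no matching header found


# Hypothetical sample log; io.StringIO stands in for an open file.
sample = io.StringIO(
    "noise\n"
    "toolA SUCCESS Number of lines : 2\n"
    "a1\n"
    "a2\n"
    "more noise\n"
    "toolB SUCCESS Number of lines : 1\n"
    "b1\n"
)
section_a = read_section(sample, "toolA")
print(section_a)
```

With a real log you would pass the open file object instead of the `StringIO` stand-in, since a file also iterates line by line.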
You can put logic like this in a generator:
    def take(seq, n):
        """ gets n items from a sequence """
        return [next(seq) for i in range(n)]

    def getblocks(lines):
        # `it` is an iterator and knows where we are in the list of lines.
        it = iter(lines)
        for line in it:
            try:
                # try to find the header:
                sec, heading, num = line.split()
                num = int(num)
            except ValueError:
                # didn't work, try the next line
                continue

            # we got a header, so take the next lines
            yield take(it, num)

    # test
    data = """
    Section1 Heading 3
    Line 1
    Line 2
    Line 3
    Maybe some irrelevant lines
    Section2 Heading 2
    Line 1
    Line 2
    """.splitlines()

    print(list(getblocks(data)))
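One caveat with `take`: if a section is shorter than its header claims, `next` raises `StopIteration` inside the generator, which Python 3.7 and later turn into a `RuntimeError`. A minimal Python 3 sketch that avoids this by using `itertools.islice`, which simply stops early, is shown below. The name `get_blocks` is hypothetical, and the header shape (three whitespace-separated fields ending in a count) is assumed from the test data.

```python
from itertools import islice


def get_blocks(lines):
    """Yield the list of lines that follows each 'Name Heading N' header."""
    it = iter(lines)
    for line in it:
        parts = line.split()
        # Assumed header shape: exactly three fields, the last a count.
        if len(parts) != 3 or not parts[2].isdecimal():
            continue
        # islice stops quietly if the input ends before N lines are read.
        yield list(islice(it, int(parts[2])))


data = """
Section1 Heading 3
Line 1
Line 2
Line 3
Maybe some irrelevant lines
Section2 Heading 5
Line 1
Line 2
""".splitlines()

blocks = list(get_blocks(data))
print(blocks)
```

Here the second header claims five lines but only two remain, so the second block just contains those two lines.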