Mike, remember that " also marks the start and end of a text string, so if a quote appears inside the text string itself, you need to escape that special character with a backslash (\).

So \"END TREES\" would be evaluated literally as "END TREES". Embedding a quote within a quote is confusing in Python: if you were doing a query for "END TREES" (including the quotes), you would have to query for

blahblahblah = "\"END TREES\""

The outer quotes tell Python where the text string begins and ends, while the escaped ones are passed along as part of the actual text string.
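To make the escaping concrete, here is a minimal sketch (the variable names are made up for illustration). It also shows that wrapping the string in single quotes gives the same value without any backslashes, which is the style the pattern in the original script uses. It is written with print(...) so it should run on Python 3 as well as Python 2.

```python
# Two ways to write the same text: "END TREES" with the quotes included.
escaped = "\"END TREES\""      # double-quoted literal, inner quotes escaped
single = '"END TREES"'         # single-quoted literal, no escaping needed

print(escaped)                 # the backslashes are not part of the value
print(escaped == single)       # both literals produce the same string
print(len(escaped))            # 9 characters of text plus the 2 quotes
```

Either form can be handed to re.compile; the regular expression engine only ever sees the quote characters, never the backslashes.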

===============================
Michael Smith MS GISP
State GIS Manager, Maine Office of GIS
State of Maine, Office of Information Technology
michael.smith _at_ maine.gov 207-215-5530

Board Member, Maine GeoLibrary
Education Chair, Maine GIS Users Group
State Rep, National States Geographic Information Council

State House Station 145
51 Commerce Drive
Augusta, ME 04333-0145
69° 47' 58.9"W  44° 21' 54.8"N
From: Northeast Arc Users Group [mailto:[log in to unmask]] On Behalf Of Lachance, Michael - APHIS
Sent: Thursday, October 02, 2014 10:07 AM
To: [log in to unmask]
Subject: Re: Isolating a block of text between two strings in a text file

Andy,

Yes, there are actually double quotes around each item in the text file.

I am able to get the code to work if I just type in a string of text for s. For example:

import os, re

f = open("200214092341eriswor_1_56.txt", "r")
s = 'TheDoctorLies'
r = re.compile('The(.*)Lies')
m = r.search(s)
print m.group(1)

Returns "Doctor"

However, I get "None" when I assign file.read() to s and use "TREES" and "END TREES", so I think the problem lies there.

Also, I removed the ? and the code worked when s was set to a simple string of text. Thanks for the tip!

Kind regards,

Michael Lachance
Plant Protection Technician
ALB Eradication Program USDA APHIS
151 West Boylston Drive
Worcester, MA 01606
508.852.8090 (o)
508.414.5673 (c)
The USDA is an equal opportunity provider and employer.
Federal Relay Service (Voice/TTY/ASCII/Spanish) 1-800-877-8339

This electronic message contains information generated by the USDA solely for the intended recipients. Any unauthorized interception of this message or the use or disclosure of the information it contains may violate the law and subject the violator to civil or criminal penalties. If you believe you have received this message in error, please notify the sender and delete the email immediately.

From: Northeast Arc Users Group [mailto:[log in to unmask]] On Behalf Of Andy Anderson
Sent: Thursday, October 02, 2014 9:57 AM
To: [log in to unmask]
Subject: Re: Isolating a block of text between two strings in a text file

Do the strings "TREES" and "END TREES" actually have quotes around them in the file? If not, leave the quotes out of the pattern, because it won't match.

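The other thing is that ? is a modifier whose meaning depends on what comes immediately before it: after a particular character or set of characters it makes that item optional, and after a quantifier such as .* it makes the match non-greedy, i.e. as short as possible. With only one "END TREES" in the file it makes no difference, so you can just leave ? out of it.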
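As a quick check of how ? behaves after .* in Python's re module, here is a small sketch. The sample string is made up, with two closing markers so the difference is visible; it is not taken from the actual file.

```python
import re

# Hypothetical sample with two closing markers.
s = 'TREES a,b END TREES c,d END TREES'

# .* is greedy: it runs to the LAST "END TREES" it can find.
greedy = re.search('TREES(.*)END TREES', s).group(1)

# .*? is non-greedy: it stops at the FIRST "END TREES".
lazy = re.search('TREES(.*?)END TREES', s).group(1)

print(repr(greedy))
print(repr(lazy))
```

When the text contains only one closing marker, both patterns capture the same thing.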

- Andy

On Oct 2, 2014, at 9:15 AM, Lachance, Michael - APHIS <[log in to unmask]> wrote:

Good morning all,

Not quite a GIS question, but since you all have proven to be so, so smart in the past, I figured I'd ask here anyways....

I am trying to write a Python script that isolates a block of text within a text file. I want to manipulate the data within that block, but first I need to isolate it.

My research told me to try using something like re.compile and re.search to do so but I haven't been having much luck. I had this block of code:


import os, re

file = open("200214092341eriswor_1_56.txt", "r")
s = file.read()
r = re.compile('"TREES"(.*?)"END TREES"')
m = r.search(s)
if m:
    trees = m.group(1)
    print trees

But it isn't returning anything and if I print m it returns "None".

The data in the text file is separated by commas, and the words "TREES" and "END TREES" are on their own lines above and below the block I'd like to isolate.
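One thing that may be worth checking, given that the markers sit on their own lines: by default the . in Python's re module does not match a newline, so (.*?) cannot stretch across several lines and the search comes back as None. Passing the re.DOTALL flag changes that. Below is a minimal sketch using a made-up in-memory sample rather than the real file, whose exact contents are not shown in this thread, so the layout is an assumption.

```python
import re

# Made-up stand-in for the file contents; the real layout is an assumption.
sample = '"HEADER"\n"TREES"\n"1","Maple"\n"2","Oak"\n"END TREES"\n"FOOTER"\n'

pattern = '"TREES"(.*?)"END TREES"'

without_flag = re.search(pattern, sample)            # . stops at newlines
with_flag = re.search(pattern, sample, re.DOTALL)    # . also matches newlines

print(without_flag)
if with_flag:
    # strip() drops the newlines left over right after and before the markers
    trees = with_flag.group(1).strip()
    print(trees)
```

If the real file behaves the same way, the same flag can be passed to re.compile as its second argument, leaving the rest of the script unchanged.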

Any thoughts?

Kind regards,

Michael Lachance
Plant Protection Technician
ALB Eradication Program USDA APHIS
151 West Boylston Drive
Worcester, MA 01606
508.852.8050 (o)
508.414.5673 (c)

-------------------------------------------------------------------------
This list (NEARC-L) is an unmoderated discussion list for all NEARC Users.

If you no longer wish to receive e-mail from this list, you can remove yourself by going to http://listserv.uconn.edu/nearc-l.html.
