Any regular expressions gurus here?
#1
I want to take a string (a protein sequence) and insert a comma after every K or R, i.e. replace every "K*" and "R*" (where * is any character) with "K,*" and "R,*" respectively. The one exception is when a P immediately follows the K or R; in that case nothing is inserted. This mimics how the enzyme trypsin cleaves. I'm trying to do this in Python, but I'm not very good at regular expressions.

I could iterate through the sequences, but I think a regex would be much faster.

For instance:

MVLTIYPDELVQIVSDKIASNRGKITLNQLWDISGKPFDLSDKKVKQFVLSCVILKKDI

MVLTIYPDELVQIVSDK,IASNR,GK,ITLNQLWDISGKPFDLSDK,K,VK,QFVLSCVILK,K,DI
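One way the rule above might be written with Python's `re` module is a capture group plus a lookahead. This is only a sketch: the function name `mark_cleavage_sites` is made up for illustration, and it assumes the sequence is a plain uppercase string and that a K or R at the very end of the sequence should not get a comma (taking "K*" to mean that something follows).

```python
import re

# [KR]      matches a K or an R
# (?=[^P])  lookahead: a next character exists and it is not P
#           (the lookahead does not consume that character)
CLEAVAGE_SITE = re.compile(r"([KR])(?=[^P])")

def mark_cleavage_sites(sequence):
    """Insert a comma after each K or R that is followed by a non-P character."""
    # \1 re-inserts the matched K or R, then the comma is appended.
    return CLEAVAGE_SITE.sub(r"\1,", sequence)

seq = "MVLTIYPDELVQIVSDKIASNRGKITLNQLWDISGKPFDLSDKKVKQFVLSCVILKKDI"
print(mark_cleavage_sites(seq))
```

If the individual fragments are wanted rather than the marked-up string, calling `.split(",")` on the result should give them as a list. If a trailing K or R should also get a comma, the negative lookahead `(?!P)` could be used in place of `(?=[^P])`.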

Any help is appreciated. I will keep reading and post back if I figure it out.

Thanks.


Messages In This Thread
Any regular expressions gurus here? - by volcs0 - 12-29-2008, 08:24 AM
