Grammatical segmentation rules for Latin-based languages

Posted by pravin on Stack Overflow
Published on 2010-05-12T06:10:18Z Indexed on 2010/05/12 6:14 UTC


Hi folks,

I am working on a feature that applies grammatical segmentation rules to Latin-based languages (currently only English).

At the moment I am at the stage of splitting user input into sentences.

For example:

"I am working in language translation". "I have used Google MT API for this"

In the example above I split the input at the full stop (.). That is the normal case, where I break a sentence on a dot, but there are a number of other characters that can end a sentence, such as ! and ?.
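To make the splitting step concrete, here is a minimal sketch in Python (an assumption for illustration only; the post does not say which language the feature is written in). It breaks after a run of `.`, `!` or `?`, optionally followed by a closing quote or bracket, when whitespace or the end of the input follows. The name `split_sentences` is hypothetical.

```python
import re

# A sentence is the shortest run of text that ends in one or more
# terminators (. ! ?), optionally followed by closing quotes/brackets,
# and is itself followed by whitespace or the end of the input.
# Text after the last terminator is kept as a final segment.
_SENTENCE = re.compile(r'.+?(?:[.!?]+["\')\]]*(?=\s|$)|$)', re.DOTALL)


def split_sentences(text):
    """Naively split text into sentences on terminal punctuation."""
    pieces = (m.group().strip() for m in _SENTENCE.finditer(text))
    return [p for p in pieces if p]


if __name__ == "__main__":
    sample = ("I am working in language translation. "
              "I have used Google MT API for this! Does it work?")
    for sentence in split_sentences(sample):
        print(sentence)
```

A dot inside a number such as 3.14 is left alone because no whitespace follows it, but abbreviations are not handled: "Dr. Smith arrived." comes back as two segments. That is the kind of case exception rules are needed for.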

I have the following SRX rules for segmentation.
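As a rough illustration of how SRX-style rules are usually applied, the sketch below models each rule as a break or no-break flag plus a "before break" and an "after break" regular expression, and lets the first rule that matches at a position decide. It is written in Python as an assumption. The two rules and the names `Rule`, `RULES` and `segment` are hypothetical, made up for this example and not taken from any real rule file.

```python
import re
from typing import NamedTuple


class Rule(NamedTuple):
    is_break: bool  # True = break here, False = exception (no break)
    before: str     # regex for the text ending at the candidate position
    after: str      # regex for the text starting at the candidate position


# Hypothetical rules: exceptions come first so they win over the
# generic break rule.
RULES = [
    Rule(False, r'\b(?:Mr|Mrs|Dr|etc|e\.g|i\.e)\.', r'\s'),
    Rule(True, r'[.!?]+["\')\]]*', r'\s+'),
]


def segment(text, rules=RULES):
    """Split text using ordered break / no-break rules."""
    compiled = [
        (r.is_break, re.compile('(?:' + r.before + ')$'), re.compile(r.after))
        for r in rules
    ]
    segments, start = [], 0
    for pos in range(1, len(text)):
        for is_break, before, after in compiled:
            # endpos=pos makes '$' anchor at the candidate position.
            if before.search(text, 0, pos) and after.match(text, pos):
                if is_break:
                    segments.append(text[start:pos].strip())
                    start = pos
                break  # first matching rule decides this position
    tail = text[start:].strip()
    if tail:
        segments.append(tail)
    return segments


if __name__ == "__main__":
    print(segment("Dr. Smith is here. He uses e.g. regex rules! Done?"))
```

Scanning every position this way is quadratic in the input length, which is fine for short user input but would need a smarter matcher for long documents.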

My questions are:

1) Is there any reference I can use to work out my language segmentation rules?

2) Alternatively, are there any forums on language segmentation where I can discuss this properly?

Please let me know if anybody knows about this.

Thanks a lot.

© Stack Overflow or respective owner
