Tuesday, February 16, 2010

Extract Email Address From Text Using Regular Expression in Java

I tried to extract an email address from arbitrary text using a regular expression, but the extraction is not 100% accurate :( So I also provide a validation method to check the extracted address. It is not very smart either: it only checks the general shape around the "@". I will try to improve it later.


import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Returns the first email-like match in content, or null if there is
// no match or the match fails validation.
public String extractEmail(String content) {
    String email = null;
    String regex = "(\\w+)(\\.\\w+)*@(\\w+\\.)(\\w+)(\\.\\w+)*";
    Pattern pattern = Pattern.compile(regex);
    Matcher matcher = pattern.matcher(content);
    // Only the first match is examined.
    if (matcher.find()) {
        email = matcher.group();
        if (!isValidEmailAddress(email)) {
            email = null;
        }
    }
    return email;
}

// Checks the whole string against a stricter, case-insensitive pattern.
public boolean isValidEmailAddress(String emailAddress) {
    String expression = "^[\\w\\-]([\\.\\w])+[\\w]+@([\\w\\-]+\\.)+[A-Z]{2,4}$";
    Pattern pattern = Pattern.compile(expression, Pattern.CASE_INSENSITIVE);
    Matcher matcher = pattern.matcher(emailAddress);
    return matcher.matches();
}
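
To see how the two methods behave together, here is a small usage sketch. The class name EmailExtractorDemo and the sample sentences are made up for illustration, and the methods are made static only so they can be called from main. The patterns are the same ones as above, and the first match is still the only one that gets checked.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; not part of the original post.
class EmailExtractorDemo {

    // Same extraction and validation patterns as in the post, compiled once.
    static final Pattern EXTRACT =
            Pattern.compile("(\\w+)(\\.\\w+)*@(\\w+\\.)(\\w+)(\\.\\w+)*");
    static final Pattern VALID = Pattern.compile(
            "^[\\w\\-]([\\.\\w])+[\\w]+@([\\w\\-]+\\.)+[A-Z]{2,4}$",
            Pattern.CASE_INSENSITIVE);

    // Returns the first match if it passes validation, otherwise null.
    static String extractEmail(String content) {
        Matcher matcher = EXTRACT.matcher(content);
        if (matcher.find()) {
            String candidate = matcher.group();
            return isValidEmailAddress(candidate) ? candidate : null;
        }
        return null;
    }

    static boolean isValidEmailAddress(String emailAddress) {
        return VALID.matcher(emailAddress).matches();
    }

    public static void main(String[] args) {
        // prints john.doe@example.com
        System.out.println(extractEmail("Please contact john.doe@example.com for details."));
        // prints null
        System.out.println(extractEmail("No address in this sentence."));
    }
}
```

Note that the surrounding words and the trailing space are left out of the match, because \w does not match whitespace.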


4 comments:

Narasimha Swamy said...

super

Nadim said...

Thx a lot!

JustSomeGuy said...

This seems to fail on the case where there is a dash '-' contained in the name like:

bob-smith@company.com OR
bob@some-company.com
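
One possible way to handle both cases is to allow '-' wherever the pattern currently uses \w alone. The sketch below is only an assumption about what a fix could look like, not the post's code: the class DashTolerantExtractor, the method extractAll and the pattern itself are hypothetical. It also skips the post's isValidEmailAddress, since that pattern still rejects a dash inside the name part. It does not try to follow the full email address grammar and will accept some odd inputs, such as a name that starts or ends with a dash.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical variant; not part of the original post.
class DashTolerantExtractor {

    // Like the post's extraction pattern, but every label may contain '-',
    // and the domain needs at least one dot-separated suffix.
    static final Pattern EMAIL =
            Pattern.compile("[\\w-]+(\\.[\\w-]+)*@[\\w-]+(\\.[\\w-]+)+");

    // Collects every match in the text instead of stopping at the first one.
    static List<String> extractAll(String content) {
        List<String> found = new ArrayList<>();
        Matcher matcher = EMAIL.matcher(content);
        while (matcher.find()) {
            found.add(matcher.group());
        }
        return found;
    }

    public static void main(String[] args) {
        // prints [bob-smith@company.com, bob@some-company.com]
        System.out.println(
                extractAll("Write to bob-smith@company.com or bob@some-company.com."));
    }
}
```

The full stop at the end of the sentence is not picked up, because a dot only counts when at least one more word character or dash follows it.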

Java Srilankan Support said...

It is very easy to validate email address using Java Mail API

How to validate email address using Java Mail API