PHP regex help for parsing a string

I have a string such as:

Are you looking for a quality real estate company? <s>Josh's real estate firm specializes in helping people find homes from [city][State].</s> <s>Josh's real estate company is a boutique real estate firm serving clients locally.</s> In [city][state] I am sure you know how difficult it is to find a great home, but we work closely with you to give you exactly what you need 

I would like to split this paragraph into an array based on the <s> </s> tags, so that I end up with the following array:

    [0] Are you looking for a quality real estate company?
    [1] Josh's real estate firm specializes in helping people find homes from [city][State].
    [2] Josh's real estate company is a boutique real estate firm serving clients locally.
    [3] In [city][state] I am sure you know how difficult it is to find a great home, but we work closely with you to give you exactly what you need

This is the regular expression I am using now:

    $matches = array();
    preg_match_all(":<s>(.*?)</s>:is", $string, $matches);
    $result = $matches[1];
    print_r($result);

But this only returns an array containing the text found between the <s> </s> tags; it ignores the text found before and after those tags. (In the example above, it would return only array elements 1 and 2.)

Any ideas?

The closest I could get was to use preg_split() instead:

    $string = <<<STR
    Are you looking for a quality real estate company?
    <s>Josh's real estate firm specializes in helping people find homes from [city][State].</s>
    <s>Josh's real estate company is a boutique real estate firm serving clients locally.</s>
    In [city][state] I am sure you know how difficult it is to find a great home, but we work closely with you to give you exactly what you need
    STR;

    print_r(preg_split(':</?s>:is', $string));

And got this result:

    Array
    (
        [0] => Are you looking for a quality real estate company?
        [1] => Josh's real estate firm specializes in helping people find homes from [city][State].
        [2] =>
        [3] => Josh's real estate company is a boutique real estate firm serving clients locally.
        [4] => In [city][state] I am sure you know how difficult it is to find a great home, but we work closely with you to give you exactly what you need
    )

Except that an extra array element is created (index 2), because there is a newline between the fragments [city][State].</s> and <s>Josh's real estate company.

It would be trivial to add some code to strip the whitespace, but I'm not sure whether that's what you want.
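If stripping the whitespace is acceptable, one possible sketch is to let the pattern absorb the whitespace around each tag and to pass PREG_SPLIT_NO_EMPTY so that empty pieces are dropped. This assumes the input is trimmed first and that whitespace directly next to a tag never needs to be preserved; the sample string is the one from the question, written on one line.

```php
<?php
// Sample input from the question (assumed here to be a single line).
$string = "Are you looking for a quality real estate company? "
    . "<s>Josh's real estate firm specializes in helping people find homes from [city][State].</s> "
    . "<s>Josh's real estate company is a boutique real estate firm serving clients locally.</s> "
    . "In [city][state] I am sure you know how difficult it is to find a great home, "
    . "but we work closely with you to give you exactly what you need ";

// \s* on both sides of the tag swallows the spaces or newlines around it;
// PREG_SPLIT_NO_EMPTY drops the empty piece left between "</s>" and "<s>".
$parts = preg_split(':\s*</?s>\s*:i', trim($string), -1, PREG_SPLIT_NO_EMPTY);

print_r($parts);
```

With the sample string this should leave four elements, matching the array asked for in the question. Note that the pattern treats opening and closing tags alike, so it does not check that the tags are balanced.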

I suggest you take a look at DOM: http://php.net/manual/en/book.dom.php
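As a rough illustration of the DOM route, the string can be loaded as an HTML fragment and its top-level nodes walked in order. This is only a sketch: wrapping the text in a <div> is an assumption made here so the fragment has a single parent, it requires the DOM extension, and it assumes the text contains no other markup that should be kept.

```php
<?php
// Sample input from the question (assumed here to be a single line).
$string = "Are you looking for a quality real estate company? "
    . "<s>Josh's real estate firm specializes in helping people find homes from [city][State].</s> "
    . "<s>Josh's real estate company is a boutique real estate firm serving clients locally.</s> "
    . "In [city][state] I am sure you know how difficult it is to find a great home, "
    . "but we work closely with you to give you exactly what you need ";

$doc = new DOMDocument();
// The <div> wrapper is an assumption of this sketch, not part of the input.
$doc->loadHTML('<div>' . $string . '</div>');
$container = $doc->getElementsByTagName('div')->item(0);

$pieces = [];
foreach ($container->childNodes as $node) {
    // Text nodes and <s> elements are handled the same way: take their text.
    $text = trim($node->textContent);
    if ($text !== '') {
        $pieces[] = $text; // whitespace-only nodes between tags are skipped
    }
}

print_r($pieces);
```

Compared with a regular expression, this keeps working if the <s> tags carry attributes or different casing, at the cost of requiring the input to be reasonably well-formed HTML.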