Discussion:
How can I sort paragraphs of text?
Chris Gianetto
2006-07-31 15:54:01 UTC
I'm trying to create a macro that will sort paragraphs of text. Below is
an example of one of the paragraphs. I want to sort them by the
"Classification:" number. The number of words between the header of the
paragraph and the Classification line varies from paragraph to paragraph.
The end result should keep each paragraph's formatting and change only the
position of the paragraphs, placing them in ascending order.

ITEM ID. URO 001
Sentence of text. Sentence of text:
1. Sentence of text. Sentence of text. Sentence of text. Sentence of text.
*2. Sentence of text. Sentence of text. Sentence of text. Sentence of text.
3. Sentence of text. Sentence of text. Sentence of text. Sentence of text.
4. Sentence of text. Sentence of text. Sentence of text. Sentence of text.
Classification: 08
Keywords: Text. Text Text text.
Ref. Sentence of text. Sentence of text. Sentence of text. Sentence of text.
Sentence of text. Sentence of text. Sentence of text. Sentence of text.
Sentence of text. Sentence of text. Sentence of text. Sentence of text.
Sentence of text.
John Nurick
2006-07-31 20:54:33 UTC
I'd export the stuff to a text file; after that it just takes a few
lines of Perl:

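use strict;
use warnings;

$/ = "";    # paragraph mode: assumes items are separated by blank lines

my %items;

while (<>) {    # read items from the file(s) named on the command line
    # group items by classification, so that items sharing a number
    # are kept instead of overwriting each other
    push @{ $items{$1} }, $_ if m/^Classification:\s*(\d+)/m;
}

# sort the classification numbers numerically; print each group in input order
print @{ $items{$_} } foreach sort { $a <=> $b } keys %items;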





--
John Nurick [Microsoft Access MVP]

Please respond in the newsgroup and not by email.
Chris Gianetto
2006-08-01 13:25:02 UTC
Thanks for the reply, but unfortunately I do not have access to Perl. I'm
trying to create a macro using VBA code that will perform the desired
functions. The macro must sort all paragraphs within a *.doc document into
ascending order. Any ideas for this method would be greatly appreciated.

Thanks,

Chris Gianetto
Helmut Weber
2006-08-01 20:21:46 UTC
Hi Chris,

for sorting paragraphs in alphabetical(!) order,
regardless of formatting:

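Sub PrimitiveSort()
    Dim i As Long
    Dim j As Long
    Dim k As Long
    Dim aBuf As String
    With ActiveDocument.Paragraphs
        k = .Count
        ' Compare each paragraph j with every later paragraph i and swap
        ' their text when they are out of alphabetical order.
        ' The inner loop stops at k - 1, so the final paragraph is never moved.
        For j = 1 To k - 1
            For i = (j + 1) To (k - 1)
                If .Item(i).Range.Text < .Item(j).Range.Text Then
                    ' Only the text is swapped; formatting is not carried along
                    aBuf = .Item(j).Range.Text
                    .Item(j).Range.Text = .Item(i).Range.Text
                    .Item(i).Range.Text = aBuf
                End If
            Next i
        Next j
    End With
End Sub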
Post by Chris Gianetto
The number of words between the header of the paragraph
and the classification header is inconsistent.
Well, I didn't understand everything,
but if the input is inconsistent,
computers generally get into difficulties.

I think I once posted some code
that takes care of basic formatting, too.

I can't find it anymore.

Sorting is a science of its own.
My sample is about the most basic possible
(a simple exchange sort, a close relative of bubble sort).
--
Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Win XP, Office 2003
"red.sys" & Chr$(64) & "t-online.de"
Chris Gianetto
2006-08-01 21:18:02 UTC
Thanks Helmut. Just to clarify my objective further: I'm trying to sort the
paragraphs with reference to a single line of text that contains a value.
Please refer to my initial post, where I put a sample paragraph. The actual
document contains 200+ paragraphs, which is why a macro is the desired tool.
So sorting with reference to a value contained within each paragraph is the
goal.

Thanks,

Chris Gianetto
Helmut Weber
2006-08-01 21:36:47 UTC
Hi Chris,
Post by Chris Gianetto
sorting with reference to a value
contained within the paragraph is the goal.
Not a problem, leaving formatting aside at first.
But how do you locate the value?

If the "value" and the text preceding it
were typed in by humans, as I assume, you have no chance.
--
Greetings from Bavaria, Germany

Helmut Weber, MVP WordVBA

Win XP, Office 2003
"red.sys" & Chr$(64) & "t-online.de"
John Nurick
2006-08-01 21:14:52 UTC
Well, you could use the same general approach in Word VBA. Pseudocode:

For each item (your "paragraph" looks as if it contains several
Word paragraphs)
    Extract the Classification value (using a wildcard
    search or otherwise)
    Add the text to a Scripting.Dictionary object, using the
    Classification value as the key (if the key already exists,
    append the text to that entry, since keys must be unique)
Next item
Sort the Classification values
For each sorted Classification value
    Insert the corresponding item(s) from the Dictionary
Next Classification
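
The pseudocode above can be sketched outside Word as follows. This is only an illustration of the logic in Python, not Word VBA: it assumes the items have been exported to plain text and are separated by blank lines, and the names `sort_items` and `CLASSIFICATION` are made up for the example. Items that share a Classification value are kept together in their original order, and items without a Classification line are placed at the end.

```python
import re
from collections import defaultdict

# Hypothetical names; items are assumed to be separated by blank lines.
CLASSIFICATION = re.compile(r"^Classification:\s*(\d+)", re.MULTILINE)


def sort_items(text):
    """Return the items in text ordered by their Classification number."""
    groups = defaultdict(list)  # classification value -> items, in input order
    unclassified = []           # items that have no Classification line
    for item in re.split(r"\n\s*\n", text.strip()):
        match = CLASSIFICATION.search(item)
        if match:
            groups[int(match.group(1))].append(item)
        else:
            unclassified.append(item)
    # Walk the keys in numeric order and emit every item stored under each key.
    ordered = [item for key in sorted(groups) for item in groups[key]]
    return "\n\n".join(ordered + unclassified)


sample = (
    "ITEM ID. URO 001\nSentence of text.\nClassification: 08\nKeywords: Text.\n\n"
    "ITEM ID. URO 002\nClassification: 02\nKeywords: Text.\n\n"
    "ITEM ID. URO 003\nSentence of text.\nClassification: 08\nKeywords: Text."
)
result = sort_items(sample)
print(result)
```

Storing a list per key is the design choice that matters here: with 200+ items and two-digit Classification values, several items will almost certainly share a key.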

Another approach, which may be simpler if you have different formatting
within your items:

Turn your items into a one-column table with one item per row.
Add a column to the table.
For each row, extract the Classification value and place
it in the empty cell (in the new column)
Sort the table on that column (using Word's built-in
sorting facilities)
Delete the column and convert the table back to text.

If your "paragraphs" really are single Word paragraphs, it's simpler:
extract the Classification number and stick a copy of it on the front of
each paragraph, separated with a tab, e.g.

08<tab>ITEM ID. URO 001
Sentence of text....

Then sort the paragraphs using the built-in sort facility; finally use a
wildcard replace to delete the prepended Classification<tab> from each
paragraph.
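
The prepend, sort, strip idea can be sketched in Python as below. This is a rough illustration rather than Word code: the function name `sort_by_prefix` and the `width` parameter are assumptions made for the example. One point it makes visible is that a plain text sort on the prefix only agrees with numeric order if the numbers are zero-padded to the same width (here, numbers of at most `width` digits).

```python
import re

CLASSIFICATION = re.compile(r"Classification:\s*(\d+)")


def sort_by_prefix(paragraphs, width=4):
    """Prepend 'key<TAB>' to each paragraph, sort as text, strip the prefix."""
    decorated = []
    for para in paragraphs:
        match = CLASSIFICATION.search(para)
        # Zero-pad so that a text sort agrees with numeric order; paragraphs
        # without a Classification get an empty key and therefore sort first.
        key = match.group(1).zfill(width) if match else ""
        decorated.append(f"{key}\t{para}")
    # Plain string sort; paragraphs with equal keys are ordered by their text.
    decorated.sort()
    # Remove everything up to and including the first tab.
    return [line.split("\t", 1)[1] for line in decorated]


paragraphs = [
    "ITEM ID. URO 001 Sentence of text. Classification: 10 Keywords: Text.",
    "ITEM ID. URO 002 Sentence of text. Classification: 9 Keywords: Text.",
    "ITEM ID. URO 003 Sentence of text. Classification: 08 Keywords: Text.",
]
sorted_paragraphs = sort_by_prefix(paragraphs)
for para in sorted_paragraphs:
    print(para)
```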

--
John Nurick [Microsoft Access MVP]

Please respond in the newsgroup and not by email.
Russ
2006-08-06 09:09:49 UTC
Chris,

If that is truly one paragraph, with only one paragraph mark at the end of
all those sentences in your example, you could use find and replace to
temporarily format every character in the document as hidden, except for
the characters you leave exposed to sort on. Then remove the hidden font
attribute from the whole document once the paragraph sorting is done.
--
Russ

drsmN0SPAMikleAThotmailD0Tcom.INVALID
Russ
2006-08-06 21:04:15 UTC
Chris,
This works for me if every sentence in your sample ends with a manual line
break (shift-return), except the last sentence, which should end with a
paragraph mark (return).

Public Sub Sort_UnHidden()
    Dim show_hide As Boolean

    show_hide = ActiveWindow.ActivePane.View.ShowAll
    ActiveDocument.Range.Font.Hidden = True
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Font.Hidden = True
        .Replacement.Font.Hidden = False
        .Text = "Classification: {1,}[0-9]{1,}"
        .Replacement.Text = "^&"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = True
        .MatchSoundsLike = False
        .MatchAllWordForms = False
    End With
    Selection.Find.Execute Replace:=wdReplaceAll
    ActiveWindow.ActivePane.View.ShowAll = False
    ActiveDocument.Range.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
        SortFieldType:=wdSortFieldAlphanumeric, SortOrder:=wdSortOrderAscending
    ActiveDocument.Range.Font.Hidden = False
    ActiveWindow.ActivePane.View.ShowAll = show_hide
End Sub
--
Russ

drsmN0SPAMikleAThotmailD0Tcom.INVALID
Russ
2006-08-06 22:54:34 UTC
Chris,
Here is a more reliable revision of my suggested subroutine, so that the sort
focuses only on the numbers and sorts numerically rather than alphabetically.
Any blank lines between paragraphs will float to the top, but you can get rid
of those and format all paragraphs with "spacing-before" through the
Format|Paragraph... menu or as part of the macro. In MS Word, paragraphs
should be spaced apart with that kind of formatting rather than with blank
lines.


Public Sub Sort_UnHidden()
    Dim show_hide As Boolean

    ' Remember the current Show/Hide setting so it can be restored at the end
    show_hide = ActiveWindow.ActivePane.View.ShowAll
    ' Show all nonprinting characters (including hidden text),
    ' then format the whole document as hidden
    ActiveWindow.ActivePane.View.ShowAll = True
    ActiveDocument.Range.Font.Hidden = True
    Selection.Find.ClearFormatting
    Selection.Find.Replacement.ClearFormatting
    With Selection.Find
        .Font.Hidden = True
        .Text = "Classification: {1,}[0-9]{1,}"
        .Replacement.Text = "^&"
        .Forward = True
        .Wrap = wdFindContinue
        .Format = True
        .MatchCase = False
        .MatchWholeWord = False
        .MatchWildcards = True
        .MatchSoundsLike = False
        .MatchAllWordForms = False
        While .Execute
            ' Move the start of the match forward to the first digit and
            ' unhide only the number
            Selection.MoveStartUntil Cset:="0123456789", Count:=wdForward
            Selection.Font.Hidden = False
            ' Extend the selection to the end of the document before the
            ' next search
            Selection.End = ActiveDocument.Range.End
        Wend
    End With
    ' Stop displaying hidden text, then sort the paragraphs numerically
    ActiveWindow.ActivePane.View.ShowAll = False
    ActiveDocument.Range.Sort ExcludeHeader:=False, FieldNumber:="Paragraphs", _
        SortFieldType:=wdSortFieldNumeric, SortOrder:=wdSortOrderAscending
    ' Unhide everything and restore the original Show/Hide setting
    ActiveDocument.Range.Font.Hidden = False
    ActiveWindow.ActivePane.View.ShowAll = show_hide
End Sub
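
Why the numeric sort matters can be seen in a few lines of Python. This is only an analogy: Python's string comparison is not identical to Word's alphanumeric sort, but both compare digit strings character by character, so numbers that are not padded to the same width end up out of numeric order.

```python
# Classification values as they might appear in the text, not all zero-padded
keys = ["10", "9", "08", "100"]

alphanumeric = sorted(keys)       # compares character by character
numeric = sorted(keys, key=int)   # compares the values the digits represent

print(alphanumeric)  # ['08', '10', '100', '9']
print(numeric)       # ['08', '9', '10', '100']
```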
--
Russ

drsmN0SPAMikleAThotmailD0Tcom.INVALID