script help
08-31-2007 05:29 AM
The file has 10 lines of the same record (f1). I want to stop after the fifth one and continue with the next record (f2), and so on.
What is the best way to do it?
10 points for the best answer.
f1
f1
f1
f1
f1
f1
f1
f1
f1
f1
fi
f2
f2
f3
f3
f3
f3
f3
f3
f3
f3
f3
f3
f3
08-31-2007 05:46 AM
Re: script help
file2
do
awk 'NR < 6 {print}' $x
done
08-31-2007 05:52 AM
Solution
#!/usr/bin/sh
typeset -i N=0
typeset -i I=0
typeset -i MAX=5 # max repeats
typeset S=''
typeset INFILE="myinfile"
uniq -c ${INFILE} | while read N S
do
    I=1
    if [[ ${N} -gt ${MAX} ]]
    then
        N=${MAX}
    fi
    while [[ ${I} -le ${N} ]]
    do
        echo "${S}"
        ((I += 1))
    done
done
08-31-2007 05:54 AM
Re: script help
With a pure shell script:
# #!/usr/bin/sh
while read LINE
do
    [ -z "${SAVE}" ] && SAVE=${LINE}
    if [ "${LINE}" = ${SAVE} ]; then
        let i=i+1
        [ ${i} -ge 5 ] && continue || echo ${LINE}
    else
        i=0
        SAVE=${LINE}
        echo ${LINE}
    fi
done < file
Regards!
...JRF...
08-31-2007 05:57 AM
Re: script help
Oops, that shebang line (line-1) should of course be:
#!/usr/bin/sh
...not:
# #!/usr/bin/sh
Regards!
...JRF...
08-31-2007 07:28 AM
Re: script help
Yours is good, but it continues to count the lines before it gets to the next string.
Is there a way to avoid counting the lines and just skip to the next string?
08-31-2007 07:47 AM
Re: script help
Think about what you are asking. Do you know of a "skip to the next (different) string" command? How would you write such a command without reading the intervening data?
08-31-2007 09:10 AM
Re: script help
08-31-2007 03:05 PM
Re: script help
You just ask the indexed sequential file to skip to a record with a key greater than the current key. :-)
08-31-2007 05:41 PM
Re: script help
Would you mind elaborating?
08-31-2007 06:08 PM
Re: script help
#!/usr/bin/sh
typeset -i I=0
typeset -i MAX=5
PREV=""
cat FILE | while read CURR; do
    if [ $CURR != $PREV ]; then
        I=1
        echo $CURR
        PREV=CURR
    else
        if [ $I -le $MAX ] ; then
            echo $CURR
            I=$I+1
        fi
    fi
done
08-31-2007 06:10 PM
Re: script help
Of course the file has to have some type of B-tree to hold the keys and records and allow these quick searches.
So you would basically do:
1) READ and after 5 matches then:
2) START key > current key
3) repeat at 1)
Since you don't have COBOL, you would have to do what Clay said, skip matching records until you come to a difference.
So unless you have 1000s of records to skip, you should just read and compare.
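As a rough illustration of the "just read and compare" approach, here is a minimal sketch using awk from the shell. The function name cap_runs and its arguments are made up for this example, and it assumes identical records sit on adjacent lines, as in the sample data.

```sh
#!/bin/sh
# cap_runs FILE [MAX]  (hypothetical helper, not from the thread)
# Print at most MAX (default 5) lines from each run of identical
# adjacent lines. Every line is still read; extras are just not printed.
cap_runs() {
    awk -v max="${2:-5}" '
        $0 != prev { n = 0; prev = $0 }   # a new record starts: reset the counter
        n++ < max                         # print while the run is still under the cap
    ' "$1"
}
```

Calling `cap_runs myinfile` would print the first five lines of each group, and `cap_runs myinfile 3` the first three.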
08-31-2007 06:34 PM
Re: script help
You forgot to initialize "i". And if you do, you only print 4 of the first group. So why did you have "[ -z "${SAVE}" ]"?
It seems you want SAVE to be empty so you go through the difference code like Victor.
>Victor: PREV=CURR
Typo, you forgot a "$" before CURR.
09-01-2007 04:30 AM
Re: script help
>Dennis: JRF, You forgot to initialize "i". And if you do, you only print 4 of the first group.
Yes, you're correct - sloppy logic on my part and in fact running with 'sh -x' exposes that. The script should look like:
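#!/usr/bin/sh
typeset -i i=0
while read LINE
do
    # First line: seed SAVE with the first record
    [ -z "${SAVE}" ] && SAVE=${LINE}
    if [ "${LINE}" = "${SAVE}" ]; then
        # Same record as before: count it, print only the first 5
        let i=i+1
        [ ${i} -gt 5 ] && continue || echo "${LINE}"
    else
        # New record: reset the counter and print it
        i=1
        SAVE=${LINE}
        echo "${LINE}"
    fi
done < file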
ALSO:
>Dennis: In COBOL an Indexed file allows you to find records by a key.
Yes, that's true, but B-trees and hashes are more germane under the assumption that the file was built with the intention of searches like this question posed. ;-)
Regards!
...JRF...
09-01-2007 04:44 AM
Re: script help
Dennis,
Yes, some records repeat for over 1000 lines, and that is why I kept asking if there is a way to skip them. It'll take forever before I get the final result.
Sorry if my question sounds stupid but I'd really appreciate any help.
If the shell cannot do it, can Perl do it then?
09-01-2007 05:49 AM
Re: script help
> some records repeat for over 1000 lines, and that is why I kept asking if there is a way to skip them. It'll take forever before I get the final result.
Are you saying that your file is static in its contents but that you repeatedly want to search it?
If that's true then you could build a hash (index) as a separate file. The index (file) would contain the offset of the first record of each "block" of similar data (akin to what your example shows). Using the index file, you find the key you want in the index; read the offset stored there associated with the key; and using that offset, seek() to the correct position in the data file. While a pure shell script can't do this, Perl can.
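A minimal sketch of this index idea, using awk and tail from the shell rather than Perl (so it is not a pure shell script either). The function names build_index and first_lines and the tab-separated index format are assumptions made for illustration. It also assumes keys contain no tabs or backslashes, and that the data file does not change after the index is built.

```sh
#!/bin/sh
# build_index DATA INDEX  (hypothetical helper)
# Write one "byte-offset<TAB>key" line for the first line of every block
# of identical adjacent lines. LC_ALL=C makes length() count bytes.
build_index() {
    LC_ALL=C awk '
        NR == 1 || $0 != prev { printf "%d\t%s\n", off, $0; prev = $0 }
        { off += length($0) + 1 }   # +1 for the newline
    ' "$1" > "$2"
}

# first_lines DATA INDEX KEY [MAX]  (hypothetical helper)
# Look the key up in the index, start reading DATA at the stored offset,
# and print at most MAX (default 5) lines of that block.
first_lines() {
    offset=$(awk -F'\t' -v k="$3" '$2 == k { print $1; exit }' "$2")
    [ -n "$offset" ] || return 1          # key not in the index
    # tail -c +N starts output at byte N (1-based), hence offset + 1
    tail -c "+$(( offset + 1 ))" "$1" |
        awk -v k="$3" -v max="${4:-5}" 'NR > max || $0 != k { exit } { print }'
}
```

After `build_index myinfile myinfile.idx`, a call such as `first_lines myinfile myinfile.idx f3` would print up to five f3 lines. Whether tail really seeks instead of reading from the start depends on the implementation, so this is worth timing on the real data.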
Regards!
...JRF...
09-01-2007 11:05 AM
Re: script help
How many total lines? And you want to visit only the first 5 of each set?
If your data is more dynamic, it would have to be sorted; then you could build that index.
(How does the file get sorted?)
Or you could just binary search forward in a C program to your guess where the next group starts.
Or in C++, create a multimap.
09-02-2007 05:32 AM
Re: script help
For better help, please indicate
- an approximate total record count
- whether records are fixed length (allowing for binary search or jump-aheads)
- whether all bytes of each record contribute to uniqueness
- what data (counters) you want to retain as well (records, dups, selected, ...)
If the skip-ahead were really important, then I would do something like:
After N dups, seek ahead another N dups.
Start with 4.
Repeat if still dup.
Binary search backwards when you have jumped too far.
So within a 10,000 dup series you might read: 1,2,3,4,8,16,32,64,128,256,512,1024,2048,4096,8192,16384,
12288,10240,9216,9728,9984,10112,10048,10016,10000
So that's a good 25 reads to count 10,000,
and only 2 more for every 2 times as many records.
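The jump-ahead idea can be sketched in shell for the fixed-length case. This is only an illustration under stated assumptions: every record has the same byte length (newline included), the file is sorted so each key forms one contiguous run, and the helper names record_at and next_group_start are invented here. It doubles the step from 1 rather than reading the first four records one by one.

```sh
#!/bin/sh
# record_at FILE INDEX RECLEN  (hypothetical helper)
# Print record number INDEX (0-based); dd seeks straight to it.
record_at() {
    dd if="$1" bs="$3" skip="$2" count=1 2>/dev/null
}

# next_group_start FILE START RECLEN  (hypothetical helper)
# Print the 0-based index of the first record after the run of
# duplicates that begins at START (or the record count at end of file).
next_group_start() {
    file=$1 start=$2 reclen=$3
    total=$(( $(wc -c < "$file") / reclen ))
    key=$(record_at "$file" "$start" "$reclen")
    lo=$start                 # last index known to hold the key
    step=1
    hi=$(( lo + step ))
    # Gallop: keep doubling the jump while we still land on a duplicate.
    while [ "$hi" -lt "$total" ] &&
          [ "$(record_at "$file" "$hi" "$reclen")" = "$key" ]; do
        lo=$hi
        step=$(( step * 2 ))
        hi=$(( lo + step ))
    done
    [ "$hi" -gt "$total" ] && hi=$total
    # Binary search between lo (a duplicate) and hi (not a duplicate).
    while [ $(( hi - lo )) -gt 1 ]; do
        mid=$(( (lo + hi) / 2 ))
        if [ "$(record_at "$file" "$mid" "$reclen")" = "$key" ]; then
            lo=$mid
        else
            hi=$mid
        fi
    done
    echo "$hi"
}
```

A caller would print five records of a group and then continue at the index returned by `next_group_start data "$start" "$reclen"`. Each probe forks a dd process, so this only pays off when runs are long.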
I guess I'll also have to do the obligatory Perl alternatives! :-)
# perl -ne 'print if $test{$_}++ < 5'
The above does NOT require sorted input.
As written it uses the whole line to indicate uniqueness, but it is readily modified to just use a substring or field.
It will gobble up memory per unique line.
It will be fine for up to 100,000 lines, but might become problematic for millions (of unique records).
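An awk analogue of the one-liner above, keyed on a single field instead of the whole line, might look like the sketch below. The function name cap_by_field and the use of whitespace-separated fields are assumptions for illustration. Like the Perl version, it keeps one counter per unique key in memory and does not need sorted input.

```sh
#!/bin/sh
# cap_by_field FILE [FIELD] [MAX]  (hypothetical helper)
# Print at most MAX (default 5) lines for each distinct value of
# whitespace-separated field number FIELD (default 1).
cap_by_field() {
    awk -v f="${2:-1}" -v max="${3:-5}" 'seen[$f]++ < max' "$1"
}
```

For example, `cap_by_field myinfile 2 5` would keep the first five lines for each distinct value of the second field.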
What problem are you really trying to solve?
It looks like the requested task will lose a lot of info but does not add much value.
Don't you want to know how many there were?
IF SORTED, no memory consumption:
$ perl -ne 'if ($last ne $_){ print "($n)\n" if $n>5; $last=$_; $n=1; print} else {print if $n++ < 5}'
Don't you want at least an indication there were more than 5?
$ perl -ne 'print if (($x=$test{$_}++) < 5); print ":\n" if 5==$x'
Hope this helps some,
Hein van den Heuvel (at gmail dot com)
HvdH Performance Consulting
09-02-2007 06:23 AM
Re: script help
Even if the records are not fixed length, you can do fuzzy skips by throwing away the partial record.