Re: How to automate opening Webpage and copying the contents into a file in the text format using perl
05-05-2008 04:51 AM
I want to automate opening a web page and copying its contents into a text (.txt) file. Please let me know how to do this in Linux.
Regards,
BS
05-05-2008 05:20 AM
Solution
05-05-2008 05:35 AM
Re: How to automate opening Webpage and copying the contents into a file in the text format using perl
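Try the solution proposed by Paul above (wget), followed by this "simple-minded" approach to stripping the HTML tags, which works for most files:
#!/usr/bin/perl -p0777
# Slurp the whole input (-0777) and delete everything that looks like a tag.
s/<(?:[^>'"]*|(['"]).*?\1)*>//gs
Hope this helps!
Kind regards,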
yogeeraj
05-05-2008 06:05 AM
05-05-2008 11:56 AM
If the web page requires authentication with a username and password, how do I take care of that?
Regards,
BS
05-05-2008 02:09 PM
Run 'wget -h' and look for options like:
[...]
HTTP options:
--http-user=USER set http user to USER.
--http-password=PASS set http password to PASS.
[...]
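As a rough sketch of how those two options might be used together, the function below saves a protected page to a file. The function name fetch_protected and the environment variables WEB_USER and WEB_PASS are hypothetical names chosen for this example. It assumes the site uses HTTP authentication; a page that logs in through an HTML form would need a different approach.

```shell
#!/bin/sh
# fetch_protected URL OUTFILE (hypothetical helper name)
# Fetch a page that requires HTTP authentication and save it to OUTFILE.
# WEB_USER and WEB_PASS are assumed to be set by the caller; the ${VAR:?}
# form aborts with a message if either one is missing.
fetch_protected() {
    wget --quiet \
         --http-user="${WEB_USER:?set WEB_USER first}" \
         --http-password="${WEB_PASS:?set WEB_PASS first}" \
         --output-document="$2" \
         "$1"
}

# Example call (placeholder URL and file name):
#   WEB_USER=me WEB_PASS=secret fetch_protected http://www.example.com/report.html report.html
```

Keep in mind that a password passed on the command line may be visible to other users in the process list, so on a shared machine it is probably safer to keep the credentials in wget's startup file instead.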
05-06-2008 12:58 PM
If your objective is only to snapshot a webpage and copy it to a file in text format, 'wget' is probably the simplest way.
http://hpux.cs.utah.edu/hppd/hpux/Gnu/wget-1.11.1/
The LWP::Simple module from CPAN provides the 'getstore' function to do the same:
http://search.cpan.org/~gaas/libwww-perl-5.812/lib/LWP/Simple.pm
Have a look, too, at:
http://search.cpan.org/~gaas/libwww-perl-5.812/lwpcook.pod
If you are serious about parsing HTML, though, I suggest you look at the HTML::TreeBuilder module beginning with:
http://search.cpan.org/~petek/HTML-Tree-3.23/lib/HTML/Tree.pm
and:
http://search.cpan.org/~petek/HTML-Tree-3.23/lib/HTML/TreeBuilder.pm
Regards!
...JRF...
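For the LWP::Simple route mentioned above, a minimal sketch could look like the following, assuming libwww-perl is installed. The function name fetch_with_lwp is made up for this example; getstore and is_success are both exported by LWP::Simple.

```shell
#!/bin/sh
# fetch_with_lwp URL OUTFILE (hypothetical helper name)
# Save URL to OUTFILE with LWP::Simple's getstore(), which returns the HTTP
# status code; is_success() reports whether that code means success.
fetch_with_lwp() {
    perl -MLWP::Simple -e '
        my ($url, $file) = @ARGV;
        my $status = getstore($url, $file);
        die "fetch of $url failed: HTTP status $status\n"
            unless is_success($status);
    ' "$1" "$2"
}

# Example call (placeholder URL and file name):
#   fetch_with_lwp http://www.example.com/ page.html
```

Like wget, this stores the raw HTML, so the tags still have to be removed afterwards if plain text is wanted.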
