07-21-2003 10:34 AM
07-21-2003 10:35 AM
Re: Does anyone know of a way to scan a file and identify the language?
Pete
07-21-2003 10:36 AM
Compiled binary? Source code? Text?
Pete
07-21-2003 10:40 AM
Chinese, Spanish, etc.?
07-21-2003 10:42 AM
Pete
07-21-2003 10:43 AM
07-21-2003 10:45 AM
Pete
07-21-2003 10:46 AM
07-21-2003 10:49 AM
I think that the header line (the "#!" interpreter line) is critical for scripts, e.g.:
for FILE in /usr/local/scripts/* ; do
# take the first comment line, which is normally the #! interpreter line
TEST=`grep '^#' "$FILE" | head -1`
case $TEST in
*ksh) SCR_LANG="korn" ;;
*csh) SCR_LANG="c-shell" ;;
*perl*) SCR_LANG="perl" ;;
*sh) SCR_LANG="bourne" ;;
*) SCR_LANG="I have no clue" ;;
esac
echo "$FILE: $SCR_LANG"
done
^^ Order is critical: with *sh first, ksh and csh would be classified as bourne.
Okay, so this part was easy, but now you get to compiled languages. My guess is that you're looking for source in C/C++/Fortran/Cobol/etc.?
It's been a while since I used Cobol, so my best guess is to "grep table $FILE", as Cobol does not use arrays like other languages. If there is a table defined, it's Cobol.
The rest is very, very difficult. You're better off going by extension. Why?
C++, C, and Fortran are very similar. Depending on the code, the same basic functions can look the same in C and C++, and similarly in Fortran and Pascal.
That would leave looking at the include statements to work out which language it is.
How many differences are there between C and C++ in #include and #define? None.
Regards,
Shannon
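The "go by extension" advice above could be sketched roughly like this. The function name lang_by_ext and the extension table are assumptions made up for illustration, not any standard mapping, and a ".h" file could just as well be C++:

```sh
#!/bin/sh
# Hypothetical helper: guess a source language from the file extension alone.
# The extension-to-language table is an illustrative assumption.
lang_by_ext() {
    case "$1" in
        *.c|*.h)                    echo "C" ;;        # .h may also be C++
        *.C|*.cc|*.cpp|*.cxx|*.hpp) echo "C++" ;;
        *.f|*.for|*.f77|*.f90)      echo "Fortran" ;;
        *.cob|*.cbl)                echo "Cobol" ;;
        *.p|*.pas)                  echo "Pascal" ;;
        *)                          echo "unknown" ;;
    esac
}
```

Something like "for F in /some/dir/* ; do echo "$F: `lang_by_ext "$F"`" ; done" would then label every file in a directory. It only works, of course, if people named their files sensibly.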
07-21-2003 10:55 AM
The thing is... you can't really do this. If you run "strings $FILENAME", it should return the ASCII text embedded in the file. However, courier is the same in any language, so you can't tell where it's from.
Shannon
07-21-2003 11:01 AM
From "man file"
...
file performs a series of tests on each file in an attempt to classify it. If file appears to be an ASCII file, file examines the first 512 bytes and tries to guess its language.
...
If by "language" they mean Spanish, German, etc., I would expect that all of the available language-set software would have to be installed.
If anyone has done this on HP-UX, I would like to know. I tried a brief test ("Uno, Dos, Tres" and "Eins, Zwei, Drei") with no success; "file" identified them only as "ascii text".
07-21-2003 11:02 AM
If it's Chinese, then it's more likely to be Unicode-encoded.
Google.com translates pages.
What is the source of these files?
live free or die
harry
07-21-2003 11:10 AM
07-21-2003 11:13 AM
live free or die
harry
07-21-2003 11:14 AM
No, 'file' attempts to identify the computer programming "language". For instance:
# file mycode.c
...might report:
c program text
For what you want, you might 'grep' for and count ('wc') words that would, with a reasonable degree of probability, be associated with a particular human language.
Regards!
...JRF..
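A minimal sketch of the word-counting idea suggested above, assuming three short, hand-picked lists of common words for English, Spanish and German. The function names count_hits and guess_lang and the word lists are invented for illustration; the lists use unaccented words only, so the result does not depend on the file's encoding. Short or mixed-language files will easily fool it.

```sh
#!/bin/sh
# Sketch only: guess the human language of a text file by counting
# very common words. The word lists are illustrative assumptions.

# count_hits FILE PATTERN: number of words in FILE matching PATTERN exactly
count_hits() {
    # one lowercase word per line, then count whole-line matches
    tr -cs 'A-Za-z' '\n' < "$1" | tr 'A-Z' 'a-z' | grep -c -x -E "$2"
}

# guess_lang FILE: print english, spanish, german or unknown
guess_lang() {
    en=$(count_hits "$1" 'the|and|of|is|that|with')
    es=$(count_hits "$1" 'el|la|de|que|los|una')
    de=$(count_hits "$1" 'der|die|das|und|ist|nicht')

    # pick the language with the most hits; no hits at all means unknown
    best="unknown"; max=0
    if [ "$en" -gt "$max" ]; then best="english"; max=$en; fi
    if [ "$es" -gt "$max" ]; then best="spanish"; max=$es; fi
    if [ "$de" -gt "$max" ]; then best="german";  max=$de; fi
    echo "$best"
}
```

Calling "guess_lang somefile.txt" prints the best guess. On a tie the language checked first wins, which is an arbitrary choice.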
07-21-2003 11:23 AM
Pete
07-21-2003 11:35 AM
http://search.cpan.org/author/MPIOTR/Lingua-Ident-1.4/
07-21-2003 11:40 AM
I found an old contest at
http://www.bwinf.de/ (sorry, in German)
(Contest 19, question 3)
Unfortunately, there is no source code in the archive.
Basically, the hints for the given solution suggest scanning for special characters unique to a language, or for so-called character key sequences or keywords. To avoid the special-character problem in this forum, here is the complete link; you may want to use only the table in the middle of the page. I did not try to "babelfish" the page into English, because I think the table itself gives you a clue.
http://www.bwinf.de/archiv/bwi19/runde1/l13main.html
Volker
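The special-character hint above might be sketched as follows. This is not the contest's solution: the function name special_char_lang, the two-character table and the assumption that the file is UTF-8 encoded are all mine. The characters are built from octal escapes so that the script itself stays plain ASCII.

```sh
#!/bin/sh
# Sketch only: look for characters that are (nearly) unique to one language.
# Assumes UTF-8 input; the two-entry table is an illustrative assumption.
special_char_lang() {
    eszett=$(printf '\303\237')   # U+00DF, German sharp s, as UTF-8 bytes
    enye=$(printf '\303\261')     # U+00F1, Spanish n with tilde, as UTF-8 bytes

    # LC_ALL=C makes grep compare raw bytes regardless of the user's locale;
    # only the exit status is used, so any output is thrown away
    if LC_ALL=C grep "$eszett" "$1" >/dev/null 2>&1; then
        echo "german"
    elif LC_ALL=C grep "$enye" "$1" >/dev/null 2>&1; then
        echo "spanish"
    else
        echo "unknown"
    fi
}
```

A file in another encoding (ISO 8859-1, for example) stores these characters as different bytes, so the patterns would have to be changed to match.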