Ban the BOM



DaveR
10-Aug-2021, 12:54 PM
Anybody got a trick to remove all the BOM markers from a folder of source files? I want to revert the code base to 19.1 to introduce a thing developed in DF20.

Notepad++ seems like a good idea but ...

hmmm: this would be a useful addition to the DFRefactor project...

DaveR
10-Aug-2021, 02:23 PM
hmmm common problem, obviously…

http://www.mind-pioneer.com/services/220_Advanced_search_and_replace.html

d

DaveR
10-Aug-2021, 06:41 PM
I thought it might be a useful Unicode-learning moment to try and code the salient bits of this in DF.

Read all the filenames in a source folder.
- For each, Read_Block the whole file into a UChar[], then check whether the first bytes are a BOM; if so, overwrite them with /// and write the UChar[] back out as a file.
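
The check-and-overwrite step in that plan can be sketched on its own. This is a rough illustration in Python rather than DF, assuming the BOM in question is the UTF-8 one (bytes 239, 187, 191); the name `overwrite_bom` and the `filler` parameter are made up for the example.

```python
UTF8_BOM = b"\xef\xbb\xbf"  # bytes 239, 187, 191


def overwrite_bom(data: bytes, filler: bytes = b"///") -> bytes:
    """Replace a leading UTF-8 BOM with a same-length filler.

    The file length does not change, so nothing after the first three
    bytes moves. Data without a BOM is returned unchanged.
    """
    if len(filler) != len(UTF8_BOM):
        raise ValueError("filler must be exactly 3 bytes")
    if data.startswith(UTF8_BOM):
        return filler + data[len(UTF8_BOM):]
    return data
```

Testing all three bytes, not just the first, avoids touching a file that merely happens to start with byte 239.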

sounds easy but I did my preparation by reading all 5 pages of https://support.dataaccess.com/Forums/showthread.php?66811-Character-and-ASCII-how-they-are-supposed-to-work/page4&p=361686
.............aaaaaaand now I'm going to pour myself a large G&T instead and look again in the morning.

Marco
10-Aug-2021, 09:58 PM
Hi Dave

I think this library could be your friend

https://github.com/DataFlexCode/cdsSeqFileHandler

Cheers
Marco

DaveR
11-Aug-2021, 06:54 AM
Drat, just got it working too :cool: (well, pretty much, not tested loads).

I thought I was 'at' that meeting but I don't recall this. Thanks Marco, I will get it.

DaveR
11-Aug-2021, 11:35 AM
Ok this works. Plug in your own folders and overwrite characters.


Object oButton1 is a Button
    Set Size to 13 60
    Set Label to "BanTheBOM"
    Set psToolTip to "Overwrite any Byte Order Marks found with //^"
    Set Location to 20 20

    Procedure OnClick
        String sPath sPathOut sBuffer
        UChar[] uAll uBlank // whole program text; uBlank stays empty and is used to reset uAll
        Boolean bExist
        Integer iCount iBomDisposal
        Move "C:\DEVKFP\Kirknet\AppSrc" to sPath
        Move "C:\DEVKFP\Kirknet\AppSrcFixed" to sPathOut

        File_Exist sPathOut bExist
        If (not(bExist)) Make_Directory sPathOut
        Direct_Input channel 2 ("dir:"+sPath+"/*.*")
        While (not(SeqEof)) // read all files in our working directory
            Readln channel 2 sBuffer
            // skip blank lines, [subfolders] and dot entries; anything else is a source file
            If ((sBuffer<>"") and (Left(sBuffer,1)<>"[") and (Left(sBuffer,1)<>".")) Begin
                Increment iCount
                Direct_Input channel 3 (sPath+"\"+sBuffer)
                Move uBlank to uAll
                Read_Block channel 3 uAll -1
                Close_Input channel 3
                If (SizeOfArray(uAll)>=3) Begin
                    // a UTF-8 BOM is the three bytes 239 187 191; test all of them,
                    // so a file that merely starts with byte 239 is left alone
                    If ((uAll[0]=239) and (uAll[1]=187) and (uAll[2]=191)) Begin
                        Showln (sPath+"\"+sBuffer) " >> " (sPathOut+"\"+sBuffer)
                        Direct_Output channel 4 (sPathOut+"\"+sBuffer)
                        Move 47 to uAll[0] // / is 47
                        Move 47 to uAll[1]
                        Move 94 to uAll[2] // 94 is ^, copyright symbol is 169; pick something that is easy to search for
                        Write channel 4 uAll
                        Close_Output channel 4
                        Increment iBomDisposal
                    End
                End
                Move False to SeqEof // Read_Block ran channel 3 to end of file; reset so the directory loop carries on
            End
        Loop
        Showln "Completed: Removed BOM from " iBomDisposal " of " iCount " sources"
    End_Procedure
End_Object



Subsequently I used the Studio to search and replace //^ with a space. That's less destructive when the original code started with a 'Use ...' line, which the //^ would otherwise leave commented out :p
Probably I should have overwritten with spaces in the first place.

Focus
12-Aug-2021, 08:42 AM
Could you not use RemoveFromArray to remove the elements you don't want before writing it out again ?
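
A minimal sketch of that idea, in Python rather than DataFlex so it can be tried anywhere: drop the three BOM bytes (239, 187, 191) instead of overwriting them. The names `strip_bom` and `strip_folder` are made up for illustration, and unlike the button code above this copies every file to the output folder, not just the ones that had a BOM.

```python
from __future__ import annotations

from pathlib import Path

UTF8_BOM = b"\xef\xbb\xbf"  # bytes 239, 187, 191


def strip_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 BOM; everything else is untouched."""
    if data.startswith(UTF8_BOM):
        return data[len(UTF8_BOM):]
    return data


def strip_folder(src: Path, dst: Path) -> tuple[int, int]:
    """Copy each file in src to dst, dropping a leading BOM where present.

    Returns (files_changed, files_seen). Subfolders are skipped.
    """
    dst.mkdir(parents=True, exist_ok=True)
    changed = seen = 0
    for path in sorted(src.iterdir()):
        if not path.is_file():
            continue
        seen += 1
        data = path.read_bytes()
        cleaned = strip_bom(data)
        (dst / path.name).write_bytes(cleaned)
        if cleaned != data:
            changed += 1
    return changed, seen
```

Something like `strip_folder(Path("AppSrc"), Path("AppSrcFixed"))` would then do a whole folder (folder names here are placeholders). Removing the bytes leaves the first line of each source intact, so there is no marker to search and replace afterwards.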

Michael Mullan
12-Aug-2021, 09:04 AM
Don't you want to scan the ENTIRE file for non-ascii characters, so that you can flag, or leave untouched all the fun Unicode chars that are actually there?

/MM
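
As a rough illustration of such a scan (Python again, not DataFlex; `non_ascii_offsets` and its `limit` parameter are invented for this sketch): report where bytes above 127 occur after any leading BOM, so a file containing real Unicode text can be flagged or left alone.

```python
from __future__ import annotations

UTF8_BOM = b"\xef\xbb\xbf"  # bytes 239, 187, 191


def non_ascii_offsets(data: bytes, limit: int = 10) -> list[int]:
    """Return the offsets of up to `limit` bytes above 127.

    A leading UTF-8 BOM is not counted, so an empty result means the
    file is plain ASCII apart from the BOM itself.
    """
    start = len(UTF8_BOM) if data.startswith(UTF8_BOM) else 0
    offsets: list[int] = []
    for pos in range(start, len(data)):
        if data[pos] > 127:
            offsets.append(pos)
            if len(offsets) >= limit:
                break
    return offsets
```

A file whose result is empty can safely lose its BOM; anything else probably wants a human look before it goes back to an OEM/ANSI code base.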

DaveR
12-Aug-2021, 09:58 AM
version 2... :cool:

this was specifically to unBOM all the sources touched in DF20 without the OEM flag set. We've still got things that we'll complete in 19.1 while we are rolling out 20.