My SQLite challenge

Started by spl, June 19, 2024, 02:44:14 PM

Previous topic - Next topic

spl

The attached zip has the same data in multiple files, differentiated by delimiter [comma, tab, caret, bar, semi-colon].

I was working on a script using the .NET SQLite dll (as my version does not support the Extender) which would
  • choose and load one of the files into a .NET datatable, figuring out the delimiter
  • create an SQLite db and table based on the column names (NOT a hard-coded Create SQL)

and then insert the text rows into the SQLite table and review the data with the SQLite browser.

I tried this in both WB and PowerShell - determining the delimiter, creating a schema.ini for the format, and including the format in a connection string using the ACE.OLEDB.16 Provider - but both created a table with a single column without separating the columns by delimiter.

I did find a simple yet elegant workaround, but not in WB code. So could code be shown using the Extender to create both a db and a table based on the text file selected?
Stan - formerly stanl [ex-Pundit]

JTaylor

Not sure if this is elegant or not, and it could use some more work, but here is one option.  If I used my SQLite Extender I could eliminate about 15 lines of code since I have an InsertFromCSV() function there.

 
AddExtender("ilcsl44i.dll")

  #DefineFunction Create_Table(db,tname,header,data)
     SQLText = "Create table IF NOT EXISTS ":tname:" ("
     cntr = 1
     Foreach col in header
       type = "TEXT"
       If IsNumber(data[cntr-1]) Then type = "NUMERIC"
       SQLText := col:" ":type:",":@CRLF
       cntr += 1
     Next
     SQLText = ItemRemove(-1,SQLText,","):");":@CRLF
     slExecute(db,SQLText)
  #EndFunction


  db = slConnect("challengers.sqlite")
  delimiters = "^|,;":@TAB
  darr = ArrayFromStr(delimiters)
  farr = Arrayize(FileItemize("*.csv"),@TAB)
   

   
  ForEach file in farr
   tname = ItemRemove(-1,file,".")
   ForEach delimiter in darr
     or = FileOpen(file,"READ")
     ftxt = FileRead(or)
     fdat = FileRead(or)
      If ItemCount(ftxt,delimiter) > 8 Then
        Create_Table(db,tname,Arrayize(ftxt,delimiter),Arrayize(fdat,delimiter))
        SQL_All = "Begin Transaction;":@CRLF
        ; include the first data row that was already read for type detection
        SQL_All := "Insert into ":tname:" values ('":StrReplace(fdat,delimiter,"','"):"');":@CRLF
        cntr = 1
        While ftxt != "*EOF*"
          ftxt = FileRead(or)
          If ftxt == "*EOF*" Then Break
          SQLText = "Insert into ":tname:" values ('":StrReplace(ftxt,delimiter,"','"):"');":@CRLF
          SQL_All := SQLText
          If cntr MOD 50 == 0
            slExecute(db,SQL_All:"Commit;")
            SQL_All = "Begin Transaction;":@CRLF
            cntr = 0
          EndIf
          cntr += 1
        EndWhile
        If SQL_All != "Begin Transaction;":@CRLF Then slExecute(db,SQL_All:"Commit;")
      EndIf
     FileClose(or)
   Next
 Next

slClose(db)


spl

Very slick, nice work. It dawned on me what was wrong with my script. I was creating the schema.ini in dirscript() rather than in the folder where the .csv was located. Also, by moving the .csv => .NET DataTable => SQLite, all columns are System.String [but that is OK]. I use Askfilename() with multiple if statements to select the delimiter used in building the connection string. I attached a .db based on the _caret csv file.

Connect string and select [for reference].
conn = 'Provider=Microsoft.ACE.OLEDB.16.0;Data Source=C:\PSFiles\ACE\;Extended Properties="Text;HDR=YES;FMT=Delimited(^);Imex=1";'
sql = "SELECT * FROM [foodsales_caret.csv];"
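For reference, the matching schema.ini - which has to sit in the same folder as the .csv (the Data Source folder), not in dirscript() - ends up looking roughly like this (these are the same values the script below writes; the section name must match the file name exactly):

[foodsales_caret.csv]
Format=Delimited(^)
ColNameHeader=True
MaxScanRows=10
CharacterSet=ANSI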
Stan - formerly stanl [ex-Pundit]

JTaylor

Thanks.  Realized I had left something out so edited my post.

Jim

cssyphus

Very nice Jim, very slick indeed. I will be implementing that code also. Much good thought there.

And now I see what you and Stan meant about writing the insert as a transaction - many, many thanks for that demo.

Hopefully your idea re the InsertFromCsv() function will find its way onto the Official Wish List for the SQLite extender. :)

JTaylor

Would be nice.   In the meantime, no reason you can't Add both extenders and use that function, when needed.

Jim

spl

Jim;

I liked your idea of an array for delimiters, but thought I needed to account for quoted csv files, so I used a series of ifs. I generally design WB to be compiled and sent out for others to use, so I code more for what can go wrong rather than what can go right. I inserted my test script code, using the .NET SQLite dll, and I am inclined to use .NET data tables whenever possible as they are quite versatile. Now I just have to look into handling fixed-length text files or possible delimiters like ? or !...

On another note, I tested with a full foodsales text file, over 90k rows, and the transaction took less than 8 seconds.
;Winbatch 2022C - prepare .NET DataTable from .csv and insert into SQLite
;Stan Littlefield, June 20, 2024
;
;NOTE: tested as 64bit WB and used 64bit SQLite .NET dll                       
;///////////////////////////////////////////////////////////////////////////////////////////////////////////////
gosub udfs
IntControl(73,1,0,0,0)
;look for text or csv file to process
types="Csv Files|*.csv|Text Files|*.txt|"
cFile=AskFilename("SelectFile", "C:\WinBatch", types, "Sol.wbt", 1)
rLen = strlen(FileBaseName(cFile))
folder = strsub(cFile,1,strlen(cFile)-rlen)
tbl = FileBaseName(cFile)
db= FileRoot(cFile)
;/////////////////////////////////////////////////////////////////////////////////////
;create schema.ini
cSchema="[":tbl:"]":@CRLF
h=FileOpen(cFile,"READ")
ftxt = FileRead(h)
fmt = "CSVDelimited"  ;default
str = StrIndex(ftxt, '","', 0, @FWDSCAN)
If str ==0 ; account for quoted csv and keep default
   If ItemCount(ftxt,@TAB) > 3
      fmt = "TabDelimited"
   Endif
   If ItemCount(ftxt,"|") > 3
      fmt = "Delimited(|)"
   Endif
   If ItemCount(ftxt,"^") > 3
      fmt = "Delimited(^)"
   Endif
   If ItemCount(ftxt,";") > 3
      fmt = "Delimited(;)"
   Endif
   If ItemCount(ftxt,",") > 3
      fmt = "CSVDelimited"
   Endif
Endif
FileClose(h)

cSchema=cSchema:"Format=%fmt%":@CRLF
cSchema=cSchema:"ColNameHeader=True":@CRLF
cSchema=cSchema:"MaxScanRows=10":@CRLF
cSchema=cSchema:"CharacterSet=ANSI":@CRLF
cINI=folder:"schema.ini"
If FileExist(cINI) Then FileDelete(cINI)
FilePut(cINI,cSchema)

;create connection string and sqlite db
cConn='Provider=Microsoft.ACE.OLEDB.16.0;Data Source=%folder%':';Extended Properties="TEXT;FMT=%fmt%;HDR=YES;IMEX=1"'

sqlite =  dirscript():"%db%.db"
If FileExist(sqlite) Then FileDelete(sqlite)

;/////////////////////////////////////////////////////////////////////////////////////
;here is where you would substitute the WB SQLite Extender
base = dirscript():"\64bit"  ;folder where  System.Data.SQLite.dll located
ObjectClrOption("AppBase", base)
ObjectClrOption("use","System.Data.SQLite")
;/////////////////////////////////////////////////////////////////////////////////////

;use .NET DataTable
ObjectClrOption("useany","System.Data")
oConn = ObjectClrNew( 'System.Data.OleDb.OleDbConnection',cConn)
oConn.Open()
oCmd = ObjectClrNew( 'System.Data.OleDb.OleDbCommand')
cSQL = "SELECT * FROM [%tbl%];"
oCmd.Connection = oConn
oCmd.CommandText = cSQL

oAdapter = ObjectClrNew( 'System.Data.OleDb.OleDbDataAdapter')
oTable = ObjectClrNew( 'System.Data.DataTable')                 
oAdapter.SelectCommand = oCmd
rowcount = oAdapter.Fill(oTable)
colcount = oTable.Columns.Count
oRows = oTable.Rows
oCols = oTable.Columns
cCreate = "Create Table %db% (":@LF
insert = "INSERT INTO %db% ("
ForEach col in oCols
   name = "`":col.ColumnName:"`"
   colType = col.DataType.ToString()
   colType = strreplace(colType,"System.","")
   objRow = oTable.Rows.Item(0)
   val = objRow.Item(col)
   cCreate = cCreate:name:" ":colType:",":@LF
   insert = insert:name:","
   ;cCreate = cCreate:name:" ":cvt(colType):",":"[VALUE: %val% .net type %colType%]":@LF
Next
cCreate = Strsub(cCreate,1,strlen(cCreate)-2):@LF:");"
insert = Strsub(insert,1,strlen(insert)-1):") ":@LF:"VALUES ":@LF

;try to create sqlite table
Display(2,"Creating ":sqlite,"Begin data load")
cn= ObjectClrNew("System.Data.SQLite.SQLiteConnection")
cn.ConnectionString = "Data Source=":sqlite
cm = objectClrNew("System.Data.SQLite.SQLiteCommand")
cm.Connection  = cn
cn.Open()
cm.CommandText = cCreate
cm.ExecuteNonQuery()
trans = cn.BeginTransaction()
Foreach r in oTable.Rows
   insert1 = insert:"("
   Foreach col in oTable.Columns
      insert1 = insert1:"'":r.Item(col):"'":","
   Next
   insert1 = Strsub(insert1,1,strlen(insert1)-1):")"
   cm.CommandText = insert1
   cm.ExecuteNonQuery()
Next
trans.Commit()
oAdapter.Dispose()
oTable=0
oCmd=0
oConn=0
cm=0
cn.Close()
Display(2,"Inserted  ":sqlite,"Data Entered into Table")
Exit

:WBERRORHANDLER
geterror()
Message("Error Encountered",errmsg)
Exit
;/////////////////////////////////////////////////////////////////////////////////////

:udfs
#DefineSubRoutine geterror()
   wberroradditionalinfo = wberrorarray[6]
   lasterr = wberrorarray[0]
   handlerline = wberrorarray[1]
   textstring = wberrorarray[5]
   linenumber = wberrorarray[8]
   errmsg = "Error: ":lasterr:@LF:textstring:@LF:"Line (":linenumber:")":@LF:wberroradditionalinfo
   Return(errmsg)
#EndSubRoutine
Return
;/////////////////////////////////////////////////////////////////////////////////////

Stan - formerly stanl [ex-Pundit]

td

Quote from: JTaylor on June 19, 2024, 07:58:31 PMNot sure if this is elegant or not, and it could use some more work, but here is one option.  If I used my SQLite Extender I could eliminate about 15 lines of code since I have an InsertFromCSV() function there.

The SQLite extender has the official SQLite CSV extension built into it. There is an example in the help file. Of course, there is the slStmExecMany function which will take an array returned from ArrayFileGetCsv in a single step.
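A minimal sketch of that two-step approach (the file name, table name, and nine-column layout are just for illustration, and the table is assumed to already exist):

AddExtender("ilcsl44i.dll", 0, "ilcsl64i.dll")

db = slConnect("foodsales.sqlite")
; one prepared insert, executed once per row of the array
hStm = slStatement(db, "INSERT INTO foodsales VALUES (?1,?2,?3,?4,?5,?6,?7,?8,?9);")
datarr = ArrayFileGetCSV("foodsales.csv", 0, ",")   ; whole file read into an array in one call
nResult = slStmExecMany(hStm, datarr, 1, -1)
slClose(db)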

"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

JTaylor

Excellent.  I had looked at some of the functions to see if there was an option but missed that part.  Sounds like I may be able to fully deprecate my Extender.  I had kept it on the list for things like this.  Thanks.

jim

JTaylor

Quote from: td on June 20, 2024, 08:25:22 AMThe SQLite extender has the official SQLite CSV extension built into it. There is an example in the help file. Of course, there is the slStmExecMany function which will take an array returned from ArrayFileGetCsv in a single step.

Here is an adjusted version.   Can you see why it is so slow?  It takes about 4 seconds.


  AddExtender("ilcsl44i.dll")
  #DefineFunction Create_Table(db,tname,header,data)
     SQLText = "Create table IF NOT EXISTS ":tname:" ("
     SQLIns  = "INSERT INTO ":tname:" Values ("
     cntr = 1
     Foreach col in header
       type = "TEXT"
       If IsNumber(data[cntr-1]) Then type = "NUMERIC"
       SQLText := col:" ":type:",":@CRLF
       SQLIns  := "?%cntr%,"
       cntr += 1
     Next
     SQLText = ItemRemove(-1,SQLText,","):");":@CRLF
     SQLIns  = ItemRemove(-1,SQLIns,","):");":@CRLF
     slExecute(db,SQLText)
     Return SQLIns
  #EndFunction


  db = slConnect("challengers.sqlite")
  delimiters = "^|,;":@TAB
  darr = ArrayFromStr(delimiters)
  farr = Arrayize(FileItemize("*.csv"),@TAB)
   
  ForEach file in farr
   tname = ItemRemove(-1,file,".")
   ForEach delimiter in darr
     or = FileOpen(file,"READ")
     ftxt = FileRead(or)
     fdat = FileRead(or)
     FileClose(or)
     If ItemCount(ftxt,delimiter) > 8 Then
       SQLIns = Create_Table(db,tname,Arrayize(ftxt,delimiter),Arrayize(fdat,delimiter))
       hStm = slStatement(db,SQLIns)
       datarr = ArrayFileGetCSV(file,0,delimiter)
       nResult = slStmExecMany(hStm, datarr, 1, -1)
     EndIf
   Next
 Next
slClose(db)


spl

Quote from: JTaylor on June 20, 2024, 09:33:46 AMHere is an adjusted version.   Can you see why it is so slow?  It takes about 4 seconds.

Again, nice! Figured I'd just mess things up. Below is a very quick and efficient PS script to perform the CSV => SQLite conversion. Place the code into the 64-bit ISE and run it. It works with a small SQLite module, which the code should download if not installed. It still uses a DataTable, but without a lot of messing around with create-table sql or bulk copy.
<#
   Test script to validate importing .csv/.txt file into .NET DataTable
   then import into SQLite table.
   Script set to determine delimiter types: comma [,], bar [|], caret [^], semi-colon [;], Tab
   Due to the Import-CSV method, columns are not typed, and would be set as System.String
   The SQLiteBulkCopy method includes a confirmation pop-up

   Found this preferable to building the DataTable from ADO .NET - as that often failed to determine delimiter(s)
#>
########################################################################################################################
$Popup = New-Object -ComObject wscript.shell

#look for PSSQLite module and attempt to load it if not available
if (Get-Module -ListAvailable -Name PSSQLite) {
    $Popup.popup("PSSQLite Module",2,"Installed",4096)
}
else {
    try {
        Install-Module -Name PSSQLite -AllowClobber -Confirm:$False -Force
    }
    catch [Exception] {
        $Popup.popup($_.message,2,"Error",4096)       
        exit
    }
}

Import-Module PSSQLite
########################################################################################################################

#select text file
Add-Type -AssemblyName System.Windows.Forms   # make sure WinForms is loaded (needed outside the ISE)
$FileBrowser = New-Object System.Windows.Forms.OpenFileDialog -Property @{
    InitialDirectory =  $PSScriptRoot
    Filter = 'Text Files (*.csv;*.txt)|*.csv;*.txt'
    Title = 'Select Text or Csv File To Process'
}
$result = $FileBrowser.ShowDialog()
if ($result -eq 'Cancel') { Exit }
########################################################################################################################

#file has been selected
$cFile = $FileBrowser.FileName
$tbl = [System.IO.Path]::GetFileNameWithoutExtension("$cFile")

#determine appropriate delimiter
$line = (Get-Content -First 1 $cFile)
$delim = ","
if ($line -match "`t") { $delim = "`t" }
elseif ($line -like '*,"*') { $delim = "," }
elseif ($line -like "*,*") { $delim = "," }
elseif ($line -like "*^*") { $delim = "^" }
elseif ($line -like "*|*") { $delim = "|" }
elseif ($line -like "*;*") { $delim = ";" }
elseif ($line -like "*[?]*") { $delim = "?" }   # escape ? so -like matches a literal question mark
elseif ($line -like "*!*") { $delim = "!" }

#load text file into .NET DataTable, also set up SQLite db
$Database = "$PSScriptRoot\$tbl.db"
$Popup.popup("Creating database $Database",2,"Please Wait",4096)
$Datatable = New-Object System.Data.DataTable
## By Default Import-csv uses first row as Header
$csvall = Import-Csv -Path $cFile -Delimiter $delim
$csvheaders = $csvall | Get-member -MemberType NoteProperty
#if SQLite db exists, remove it [if testing script multiple times]
If (Test-Path $Database) { Remove-Item $Database }

Foreach ($header in $csvheaders) {   
    $Datatable.Columns.Add($header.Name)   
}

#Data Columns set, now insert rows of data into Datatable
$Popup.popup("Adding rows to $Datatable",2,"Please Wait",4096)
Foreach($csv in $csvall) {   
   $row = $Datatable.NewRow()
   Foreach ($col in $Datatable.Columns)
   {
      $c = $col.ColumnName
      $row.$c = $csv.$c
   }
   $Datatable.Rows.Add($row)
}

$Popup.popup("Creating SQLite table $tbl",2,"Please Wait",4096)
$create = "Create Table $tbl ( `n"
$cr = ""
Foreach ($col in $Datatable.Columns)
{
   $name = '`'+$col.ColumnName+'`'
   $colType = $col.DataType.ToString()
   $colType = $colType.Replace('System.','')
   $cr += "$name $colType,`n"
}
$cr = $cr.Substring(0, $cr.Length - 2)
$create = $create + $cr + ")"
Invoke-SqliteQuery -Query $create -DataSource $Database
#Table structure for Datatable has been created, now bulk insert row data, w/confirmation pop-up
########################################################################################################################
$Popup.popup("Inserting Rows `n you will be promted to confirm",3,"Please Wait",4096)
Invoke-SQLiteBulkCopy -DataTable $DataTable -DataSource $Database -Table $tbl
$DataTable = $null
$csvall  = $null
$Popup.popup("Depending on your confirmation selection `n either the table is constructed or blank.",3,"Closing",4096)
Exit
########################################################################################################################
Stan - formerly stanl [ex-Pundit]

td

Quote from: JTaylor on June 20, 2024, 09:33:46 AMHere is an adjusted version.   Can you see why it is so slow?  It takes about 4 seconds.

A nice bit of scripting. On my system, the script runs in about 1.6 seconds. It is marginally faster than that when using 64-bit WinBatch. The different timing between our systems is likely related to hardware differences.

The majority of time is spent in the slStmExecMany function. That surprises me a little. The function does a lot of anomaly checking so there may be room for optimization.  I will look into it as time permits.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

I decided to try the SQLite extender's CSV extension using the following. It executes in ~0.0 seconds on my over-the-top workstation. Your mileage may vary.

AddExtender("ilcsl44i.dll", 0, "ilcsl64i.dll")

#DefineFunction Standardize(_file, _adelims)
   ret = 0
   Contents = FileGet(_file)
   Line2 = ItemExtract(2, Contents, @lf)
   ForEach Delim in _adelims
      If ItemCount(Line2,Delim) > 8 
         Contents = StrReplace(Contents,Delim, ",")
         FilePut(_file, Contents)
         ret = 1
         break
       endif
   next
   return ret
#EndFunction

db = slConnect("csvextention.sql")
aDelims = ArrayFromStr("^|,;":@TAB)
aFiles = Arrayize(FileItemize("*.csv"),@TAB)
   
ForEach Cvsfile in aFiles
   TabName = ItemRemove(-1,Cvsfile,".")
   Standardize(Cvsfile, aDelims)
   SQL = "CREATE VIRTUAL TABLE ":TabName:" USING csv(filename=""":Cvsfile:""",header=TRUE);"
   slExecute(db, SQL, @SlNoOutput )
Next

The script is rough and permanently modifies the input files. Of course, file modification can be addressed by saving each standardized file's contents to a file with a different name. The one downside to the SQLite CSV extension is that you need to keep both the small generated database and the standardized files around for lookups to work.
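If keeping the csv files around is a problem, one option (a quick sketch, not tested) is to copy each virtual table into an ordinary table while the csv still exists and then drop the virtual table:

AddExtender("ilcsl44i.dll", 0, "ilcsl64i.dll")
db = slConnect("csvextention.sql")

TabName = "foodsales_caret"   ; one of the virtual tables created by the loop above
; materialize the rows, then drop the virtual table so the csv file is no longer needed
slExecute(db, "CREATE TABLE ":TabName:"_data AS SELECT * FROM ":TabName:";", @SlNoOutput)
slExecute(db, "DROP TABLE ":TabName:";", @SlNoOutput)
slClose(db)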
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

spl

Quote from: td on June 21, 2024, 08:25:37 AMI decided to try the SQLite extender's CSV extension using

Wow, nice. One more reason to upgrade from 2022C. Seems the Devil's-advocate questions and Plan-Bs I've thrown out have had a positive effect.

One more challenge. This one is from Go programmers who asked me how you would convert json => sqlite. I think there are about 417 rows as a json array. My plan-B would be to convert json => csv => load into sqlite. It would be nice to see a WB solution using both the Json and SQLite extenders. Basically a 'who cares' effort, but for the public audience it is another example of the power of scripting. The attached zip has both the raw json and a converted sqlite db.

C'mon, it's the weekend and hotter than Hades.


Stan - formerly stanl [ex-Pundit]

JTaylor

SQLite has very good JSON capabilities built in.   Shouldn't need anything else.

Jim

spl

Quote from: JTaylor on June 21, 2024, 11:31:07 AMSQLite has very good JSON capabilities built in.   Shouldn't need anything else.

Jim

I am aware of that, but most seem to treat json as a single column/object - I was talking about parsing the individual json attributes into their own columns. Maybe one of those functions does that, and if so, great. My issue, working with Json API returns, was to extract the 'parent' for Json arrays. In the case of what I posted, the parent = stakers, so I would be looking for simple [pseudo-code]: Foreach staker in json_data ... Foreach attribute in staker ... group attributes and insert into an SQLite table (all that assuming a json array with a parent).
Stan - formerly stanl [ex-Pundit]

JTaylor

When I get time I will take a look.  On the clock now.   I was talking about parsing as well.

Jim

JTaylor

This should do it.


AddExtender("ilcsl44i.dll")

SQLText = `WITH json_data AS ( SELECT '`:FileGet("stakers.json"):`' AS jsons )`:@CRLF
SQLText := `INSERT INTO stakers `:@CRLF
SQLText := `SELECT`:@CRLF
    SQLText := `json_extract(value, '$.avaxAssigned') AS avaxAssigned,`:@CRLF
    SQLText := `json_extract(value, '$.avaxStaked') AS avaxStaked,`:@CRLF
    SQLText := `json_extract(value, '$.avaxValidating') AS avaxValidating,`:@CRLF
    SQLText := `json_extract(value, '$.avaxValidatingHighWater') AS avaxValidatingHighWater,`:@CRLF
    SQLText := `json_extract(value, '$.ggpLockedUntil') AS ggpLockedUntil,`:@CRLF
    SQLText := `json_extract(value, '$.ggpRewards') AS ggpRewards,`:@CRLF
    SQLText := `json_extract(value, '$.ggpStaked') AS ggpStaked,`:@CRLF
    SQLText := `json_extract(value, '$.index') AS "index",`:@CRLF
    SQLText := `json_extract(value, '$.lastRewardsCycleCompleted') AS lastRewardsCycleCompleted,`:@CRLF
    SQLText := `json_extract(value, '$.rewardsStartTime') AS rewardsStartTime,`:@CRLF
    SQLText := `json_extract(value, '$.stakerAddr') AS stakerAddr`:@CRLF
SQLText := `FROM json_data, json_each(json_extract(jsons, '$.stakers'));`:@CRLF

  db = slConnect("stakers.sqlite")
  slExecute(db,SQLText)

spl

Nice, but hard-coded columns. Need a more generalized script to insert any given json.
Stan - formerly stanl [ex-Pundit]

JTaylor

Are we assuming a single level or are you wanting every sublevel to be a different table?

Jim

JTaylor

This, obviously, makes some assumptions but does the job. 


AddExtender("ilcsl44i.dll")
jfname = "stakers.json"

db = slConnect("stakers.sqlite")

SQLText = `create table if not exists json_data (jsond TEXT);`
tarr    = slExecute(db,SQLText)

SQLText = `insert into json_data values ('`:FileGet(jfname):`');`:@CRLF
tarr    = slExecute(db,SQLText)

SQLText = `select json_each.key AS key, json_each.type AS type FROM json_data, json_each(json_extract(jsond, '$'));`
tarr    = slExecute(db,SQLText)

tname = "mytable"
If tarr[0,0] != "" && tarr[0,1] == "array" Then tname = tarr[0,0]

SQLText = `drop table if exists `:tname:`;`:@CRLF
tarr    = slExecute(db,SQLText)

SQLText = `select distinct json_each.key AS key, json_each.type AS type FROM json_data, json_each(json_extract(jsond, '$.%tname%[0]'));`:@CRLF
tarr    = slExecute(db,SQLText)

sfields = @CRLF:"  select "
SQLText = `create table `:tname:` (`
For x = 0 to ArrInfo(tarr,1)-1
  SQLText := `"`:tarr[x,0]:`" `:tarr[x,1]:", "
  sfields := `json_extract(value, '$.`:tarr[x,0]:`'), `
Next
SQLText = ItemRemove(-1,SQLText,",")
sfields = ItemRemove(-1,sfields,","):` from json_data, json_each(json_extract(jsond, '$.`:tname:`'));`

SQLText := ");"
tarr    = slExecute(db,SQLText)

SQLText = `insert into `:tname:sfields
tarr    = slExecute(db,SQLText)

SQLText = `drop table if exists json_data;`
tarr    = slExecute(db,SQLText)

slClose(db)


spl

Again nice, but it does make the assumption that 'stakers' is the parent, i.e. that the resulting table would be an iteration of each staker 'record'.  Actually I ran up against a problem with identifying a parent - the script logic being focused on json returns from an API, assuming well-formed json and records associated with the parent, in this case stakers.

I found that
  • if the json is derived directly from the API, the parent 'name' can be extracted from the result
  • if the json is saved to a file and then loaded, the code to find the parent does not work, as the json object now has to be treated separately from a plain API return

Easy to ultimately code. Now as to nested json arrays... that would probably take some practice.

Stan - formerly stanl [ex-Pundit]

JTaylor

That one reference was an oversight.   The name of the file and db had to be something.  I don't think I can write something that would figure out what source you wanted to use on any particular day.

I also don't think anyone can write something that could handle every single situation.  I am guessing situations where the fields in one record are different from the fields in another would cause a problem.

As I said, I had to make some assumptions.  I just used the root level name as the tname.  If the source doesn't have one, this code will fail.  My main goal was to demonstrate that what you are asking for can be done with sqlite, without resorting to turning the json into some other format.

Jim

spl

Quote from: JTaylor on June 22, 2024, 11:39:03 AMI also don't think anyone can write something that could handle every single situation.  I am guessing situations where the fields in one record are different from the fields in another would cause a problem.


Couldn't agree more. My caveat was well-formed json data from an API, whether a direct result from a REST query or a persisted .json file loaded from said query. The focus is on identifying a parent to be used to iterate through child attributes/values as rows; once that works, converting to valid csv/excel makes SQLite or any other db table a nice-to-have. I used SQLite as that is the focus of this thread.

Obviously a more detailed conversation about the concept can be handled off-line or in another thread. And as for dealing with nested attributes [for API results], the generalized code would serve as a template for a script specific to the particular json result.
Stan - formerly stanl [ex-Pundit]

JTaylor

What I meant about the fields differing is the reason I keep pestering Tony to allow for a default value from the jsGetValue() function in the json extender.  It isn't malformed JSON, as such.  Practically every json feed I get has some fields in one record but not in others.  Mostly book stuff, so one might have a Series title but another not.  Instead of having a Series field in all the records and leaving it blank when needed, they just leave out the field altogether.
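For what it's worth, on the pure-SQLite side a missing key just comes back as NULL from json_extract(), so a default can be supplied with ifnull(). A rough sketch (the books.sqlite db, the json_data staging table, and the field names are made up for illustration, following the pattern of the generalized script above):

AddExtender("ilcsl44i.dll")

db = slConnect("books.sqlite")
; records that omit '$.series' return NULL from json_extract();
; ifnull() substitutes a default instead of leaving a gap
SQLText = `SELECT json_extract(value, '$.title') AS title,`:@CRLF
SQLText := `ifnull(json_extract(value, '$.series'), 'n/a') AS series`:@CRLF
SQLText := `FROM json_data, json_each(json_extract(jsond, '$.books'));`:@CRLF
tarr = slExecute(db, SQLText)

slClose(db)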

After thinking about it, with a minor tweak, the code up above could handle not having a root name.

Jim

spl

Quote from: JTaylor on June 22, 2024, 04:17:52 PMIt isn't malformed JSON, as such.

I remember you bringing up that issue. Maybe not malformed, but certainly poorly written output if it is coming from the same source. One would assume all attributes are covered and any missing values would still include the attribute name as null. Just my .02
Stan - formerly stanl [ex-Pundit]

JTaylor

Don't disagree.  I am not sure I have ever used a JSON feed that had consistent fields.  I have seen them, but of all the ones I work with, and there are several, I can't think of a single one that is consistent.  That is why, when I wrote my JSON extender, I included the option for a default value rather than having to overcome the errors all the time.  Still hoping to see that minor change show up in the new extender :-)

Jim

td

Quote from: td on June 20, 2024, 02:19:35 PM
Quote from: JTaylor on June 20, 2024, 09:33:46 AMHere is an adjusted version.  Can you see why it is so slow?  It takes about 4 seconds.

A nice bit of scripting. On my system, the script runs in about 1.6 seconds. It is marginally faster than that when using 64-bit WinBatch. The different timing between our systems is likely related to hardware differences.

The majority of time is spent in the slStmExecMany function. That surprises me a little. The function does a lot of anomaly checking so there may be room for optimization.  I will look into it as time permits.

A footnote.

If the "nResult = slStmExecMany(hStm, datarr, 1, -1)" line in Jim's original script is written instead like the following:

       slExecute(db, "BEGIN TRANSACTION")
       nResult = slStmExecMany(hStm, datarr, 1, -1)
       slExecute(db, "COMMIT")

Jim's script runs in ~.061 seconds instead of ~1.6 seconds on my system.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

JTaylor

That does resolve the issue but seems like for such a small amount of data it shouldn't take that long anyway. 

That also affects that large file I tried loading.  It simply runs out of memory in a few seconds rather than the 8-10 hours like before.

Is there a reason for the extender not to use TRANSACTIONS as part of its operation?  Would that be a bad assumption to make?

Thanks.

Jim

td

Your memory issue is, hopefully, addressed in the latest release of the extender.

There are philosophical, design, and practical reasons for not "automagically" including the "BEGIN TRANSACTION" and "COMMIT" statements inside the slStmExecMany function.
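As a rough illustration of the practical side (a sketch only - the table, file name, and column count are assumed): leaving transaction control to the caller means the script decides where the unit of work begins and ends, and whether a load is kept at all.

AddExtender("ilcsl44i.dll", 0, "ilcsl64i.dll")
db = slConnect("challengers.sqlite")

slExecute(db, "BEGIN TRANSACTION")
hStm = slStatement(db, "INSERT INTO foodsales_caret VALUES (?1,?2,?3,?4,?5,?6,?7,?8,?9);")
nResult = slStmExecMany(hStm, ArrayFileGetCSV("foodsales_caret.csv", 0, "^"), 1, -1)
; the caller, not the extender, decides whether to keep the load
slExecute(db, "COMMIT")   ; or slExecute(db, "ROLLBACK") to discard it
slClose(db)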

"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

td

Also, a profiler indicates that all the slowness in your challenge script is in the SQLite library and not in the extender's code, which is the reason the extender has the SQLite CSV extension compiled in: to provide support for large CSV files.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade

spl

Quote from: JTaylor on June 23, 2024, 12:04:49 PMDon't disagree.  I am not sure I have ever used a JSON feed that had consistent fields.
Jim

And I spoke too broadly.  Browse this

https://gist.githubusercontent.com/sanderstad/1c47c1add7476945857bff4d8dc2be59/raw/d12f30e4aaf9d2ee18e4539b394a12e63dea0c9c/SampleJSON1.json

The color white will be missing the 'type' attribute in the returned json, but I'm still able to parse the json into Excel or a db and it just displays as an empty value. Not the same as 'inconsistent' fields. Maybe you can send me a sample through private email so I can experience the failure.
Stan - formerly stanl [ex-Pundit]

JTaylor

The issue with large files is resolved.   It only took 6-7 seconds for 660,000+ records.

Also reminded me I needed to update.  Didn't realize it had been that long.

Thanks for the quick action.

Jim

td

Thanks for reporting your result. It appeared to be fixed, but your confirmation erased any remaining doubts.

I have set up a 500,000-record, 85MB CSV file as a benchmark for future releases.
"No one who sees a peregrine falcon fly can ever forget the beauty and thrill of that flight."
  - Dr. Tom Cade