New Stumbler with some questions about the Wigle project!

The gear needed for wardriving

6 posts • Page 1 of 1
Hello!

First post!

Questions!

I see Wigle's API is fairly limited to prevent scraping, which makes sense. Is it also limited if I want to query my own uploads?
--> If so, are there any plans for a credit system for the Wigle API?
------> If so, have you thought about awarding API credits for new submissions? That would solve the above issue of querying my own uploads, and it is also an obvious road to monetization for Wigle if you ever wanted to go the SaaS route.

Raw-data, Trilateration, format, taxonomy.
I am working on creating a low-energy wardriving device, and while I have found the CSV upload format for Wigle, I haven't managed to find the raw file format in the app's source/git.
(I do not have android development experience, so it's pretty foreign to me.)

As the CSV does not appear to include enough detail for trilateration or other interesting processing, I assume that processing has to happen on my device (not low-energy behaviour..).
To better assess what I am up against, I was trying to find the raw capture format / taxonomy. I understand it's an app database, but I can't seem to find anything useful on the Pixel 3 I am testing with.

Can someone point me towards either:
A raw data capture file. This would allow me to review the format, and hopefully learn about the steps that occur between initial capture and the eventual CSV format.
Alternatively, an explanation of the raw capture taxonomy. What exactly is captured, in what datatype, in what format is it uploaded to Wigle, and what preprocessing should I be doing (in addition to trilateration)?


Beginner's guide?
As someone relatively new to both stumbling for wifi, as well as Wigle, I haven't really found an explanation as to the long history of Wigle from a technical perspective.
Much of the above could likely be answered in a single spot somewhere, but there are also development decisions where the original thought process isn't clear to me, such as:

Why are networks points, instead of areas?
Considering the signal strength is a radius during trilateration, why not expand beyond a point and use an estimated area in which the network resides?

What the heck are "tiles"?
I see reference in forum posts about "tiles", but haven't come across an explanation as to how big each tile is, if they are uniform across the globe, etc.


JW
1. there's an API flag for your own points that allows you to query them without a daily threshold. People who upload and demonstrate non-harvesting-bot-like behavior have higher query limits, as do folks who contact us with specific projects. We also provide KML and CSV export of data you've uploaded. Your WiGLE Wardriving client also contains a number of export options on the Database and Upload screens.

2. one of those export options is the SQLite database that includes your individual observations. Our "raw" captures on Android are limited by the contents of the WiFiManager API on Android, which doesn't include frame / packet data. If you're interested in a low-power device network detection, a WiGLE user has formulated a pretty efficient stumbler. If you want something comprehensive, Kismet is a remarkably awesome tool.

3. Network search results and map presence are summarized by their signal-strength-weighted trilaterated centroids to assist with querying and visualization, and network "detail" queries include individual observed points (in both the website UI and in the API). If you use the basic search interface, selecting an AP will draw concentric concave hulls around them based on the network/detail query. The Advanced Search detail query provides the individual points as well!

not sure about "tiles" - perhaps the posts you're looking at are about the old raster tile-based WiGLE visualization packages (JiGLE and DiGLE, which were retired almost a decade ago?). We'd caution you that context-free posts from that far back might not be worth your time!
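
For point 1, a minimal sketch of what a query limited to your own uploads might look like. It assumes the v2 network search endpoint, an `onlymine` query parameter, and HTTP Basic auth with your API name and token; treat all three as assumptions and confirm them against the current API documentation. The credentials and SSID are placeholders, and the code only builds the request rather than sending it.

```python
import base64
from urllib.parse import urlencode

# Assumed endpoint; check the current API docs before relying on it.
SEARCH_URL = "https://api.wigle.net/api/v2/network/search"


def build_own_networks_request(api_name, api_token, ssid=None):
    """Return (url, headers) for a search restricted to your own uploads."""
    # "onlymine" is assumed to be the flag mentioned in point 1.
    params = {"onlymine": "true"}
    if ssid is not None:
        params["ssid"] = ssid
    # HTTP Basic auth: base64 of "name:token".
    credentials = f"{api_name}:{api_token}".encode()
    headers = {
        "Authorization": "Basic " + base64.b64encode(credentials).decode(),
        "Accept": "application/json",
    }
    return f"{SEARCH_URL}?{urlencode(params)}", headers


# Placeholder credentials, not real ones.
url, headers = build_own_networks_request("example-name", "example-token", ssid="HomeNet")
print(url)
```

The returned URL and headers can be handed to whatever HTTP client you prefer.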

some places to start reading:
https://wigle.net/faq
https://wigle.net/tools
https://api.wigle.net/csvFormat.html
https://wigle.net/wiwi_settings

Cheers,

-Ark and the WiGLE team
Thanks for your reply here as well! (I hope opening multiple threads like that is the right process..)


"If you're interested in a low-power device network detection, a WiGLE user has formulated a pretty efficient stumbler. If you want something comprehensive, Kismet is a remarkably awesome tool."
--> That git is neat! I'll have a go with it and see how it goes.
--> I am not sure, is here a good place to ask a bit about Kismet? I dug into the Kismet docs, and actually tried installing it on a Linux machine, but it seems I am lacking in my under

"If you use the basic search interface, selecting an AP will draw concentric concave hulls around them based on the network/detail query. The Advanced Search detail query provides the individual points as well!"
This is very helpful, thank you! Can you point me to the file on GitHub where the calculation occurs? I want to better understand the trilateration process. (A centroid is just the center point, correct?)

"We'd caution you that context-free posts from that far back might not be worth your time!"
--> Totally, I do get the impression things are still evolving!

Thank you for the URL resources as well, some of this I did come across, but will be sure to read the rest before I ask too many more questions!

I did manage to track down the SQLite export option as well. The .schema there helps paint a better picture, but I notice that the CSV format doesn't actually match the network table in the .sqlite file:

CSV:

MAC,SSID,AuthMode,FirstSeen,Channel,RSSI,CurrentLatitude,CurrentLongitude,AltitudeMeters,AccuracyMeters,Type

SQLite:

sqlite> .schema
CREATE TABLE network (
    bssid TEXT PRIMARY KEY NOT NULL,
    ssid TEXT NOT NULL,
    frequency INT NOT NULL,
    capabilities TEXT NOT NULL,
    lasttime LONG NOT NULL,
    lastlat DOUBLE NOT NULL,
    lastlon DOUBLE NOT NULL,
    type TEXT NOT NULL DEFAULT 'W',
    bestlevel INTEGER NOT NULL DEFAULT 0,
    bestlat DOUBLE NOT NULL DEFAULT 0,
    bestlon DOUBLE NOT NULL DEFAULT 0
);

I assume the best(level|lat|lon) columns have to do with the weighting you mentioned for trilateration, is that correct?

Also, the SQLite table shows "lasttime". Is this because it's expected there will be multiple rows for each device, and "FirstSeen" can be easily calculated from there?
Two quick answers:

- The convex hull algorithm runs on your machine. It's JavaScript we send you, so it isn't in our public GitHub. However, we intentionally serve our .js un-minified, so you can literally examine how we do it in the sources of the page in question! Look for "WiGLE.silverlinings" in mapsearch.js.

- The difference there is between the local best-known point (in the network table) and the individual observations. We perform trilateration on the server, sending observations from the device as they occur. Check out the other tables in SQLite. Basically, we designed things so you will be able to do whatever you want with the data on your device; what you see in the upload format and on the server is designed to feed centralized data quality, the master observation list, and the eventual trilaterated result we produce on the site / in our API / on our maps.
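
A rough sketch of poking at the exported database from Python, using only the standard `sqlite3` module. The `network` schema is copied from the `.schema` output earlier in the thread; the inserted row is invented for illustration, and the in-memory database stands in for the file you exported from the app. Listing tables via `sqlite_master` avoids assuming what the other tables are called.

```python
import sqlite3

# Schema copied from the .schema output earlier in the thread.
NETWORK_SCHEMA = """
CREATE TABLE network (
    bssid TEXT PRIMARY KEY NOT NULL,
    ssid TEXT NOT NULL,
    frequency INT NOT NULL,
    capabilities TEXT NOT NULL,
    lasttime LONG NOT NULL,
    lastlat DOUBLE NOT NULL,
    lastlon DOUBLE NOT NULL,
    type TEXT NOT NULL DEFAULT 'W',
    bestlevel INTEGER NOT NULL DEFAULT 0,
    bestlat DOUBLE NOT NULL DEFAULT 0,
    bestlon DOUBLE NOT NULL DEFAULT 0
)
"""


def list_tables(conn):
    """Return the names of all tables in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def best_points(conn):
    """Return each network's locally stored best-known point, strongest first."""
    return conn.execute(
        "SELECT bssid, ssid, bestlevel, bestlat, bestlon "
        "FROM network ORDER BY bestlevel DESC"
    ).fetchall()


# In-memory stand-in; with a real export you would pass its file path instead.
conn = sqlite3.connect(":memory:")
conn.execute(NETWORK_SCHEMA)
# Invented sample row, for illustration only.
conn.execute(
    "INSERT INTO network (bssid, ssid, frequency, capabilities, lasttime, "
    "lastlat, lastlon, bestlevel, bestlat, bestlon) "
    "VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("aa:bb:cc:dd:ee:ff", "HomeNet", 2437, "[WPA2-PSK-CCMP][ESS]",
     1600000000000, 40.0, -75.0, -48, 40.0001, -75.0001),
)

print(list_tables(conn))
print(best_points(conn))
```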

-ark
I see, so to be clear, the actual trilateration calculation itself doesn't occur publicly, is that right?
--> I am hoping to figure this portion out, as it's the most interesting part! haha. I hope to eventually set up several inexpensive SoC devices to try my hand at real-time indoor location detection (many APs, one moving target device), so I wanted to see how the trilateration goes from raw data capture to a specific point.

Let me know. If this is public, I am having trouble finding it, but if it's not something Wigle shares, I understand too.
it happens server-side - but it's not complicated!

1. filter out suspicious/garbage data. Locations that don't exist, GPS accuracy that might as well not have coordinates at all, timestamps that indicate the GPS clock was hosed, devices that can't exist.... (AND MANY MORE)
2. take a simple geospatial mean of the most recent points, then run a standard deviation on the locations vs. that mean. if the deviation is low (vs. the number of input points, mind you), congrats, you've got a candidate for a signal-strength weighted centroid!
3. if the deviation is high, then you need to do some cluster detection or sigma partitioning. depending on which you pick:
a. for partitioning, we recommend you start removing the oldest and/or most distant points, see if you're left with a reliable cluster. you can repeat this step, taking new means and sigmas until you're satisfied, or you're out of locations.
b. for cluster detection, pick a threshold, and sort the location set into proximal groups. use freshness/frequency/signal strengths/whatever you like to pick the "best" group of locations to continue
4. do a signal-strength weighted average of the final set of locations you've selected:
latCentroidNumerator = 0
lonCentroidNumerator = 0
denominator = 0
for location in locations {
    latCentroidNumerator += (location.latitude * location.signal)
    lonCentroidNumerator += (location.longitude * location.signal)
    denominator += location.signal
}
latitudeCentroid = latCentroidNumerator / denominator
longitudeCentroid = lonCentroidNumerator / denominator
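
Step 4 could be sketched in Python roughly like this. One assumption worth flagging: RSSI in dBm is negative, so using it directly as a weight would favor the weakest readings. This sketch converts dBm to linear milliwatts first; that conversion is an illustrative choice, not necessarily what the server does. The `Observation` type and the sample values are made up.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    latitude: float
    longitude: float
    rssi_dbm: float


def dbm_to_weight(rssi_dbm):
    """Convert dBm to linear milliwatts so stronger signals get larger positive weights."""
    return 10 ** (rssi_dbm / 10)


def weighted_centroid(observations):
    """Signal-strength-weighted average of latitude and longitude."""
    if not observations:
        raise ValueError("need at least one observation")
    lat_numerator = lon_numerator = denominator = 0.0
    for obs in observations:
        weight = dbm_to_weight(obs.rssi_dbm)
        lat_numerator += obs.latitude * weight
        lon_numerator += obs.longitude * weight
        denominator += weight
    return lat_numerator / denominator, lon_numerator / denominator


# Invented sample data: the strong reading should dominate the result.
samples = [
    Observation(40.0000, -75.0000, -40.0),
    Observation(40.0010, -75.0010, -70.0),
]
print(weighted_centroid(samples))
```

Plain averaging of longitudes breaks down near the antimeridian, so a production version would need to handle that case separately.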
There's a lot of weird detail and corner cases in our implementation, compensating for device quirks, input vagaries, fraud detection... and our bad data detection stuff has an insane list of tests, but at the core, that's the algorithm at the center of most radio location you see.
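
Steps 2 and 3a might look something like the following in Python. This is a simplified sketch under several assumptions: the "standard deviation" is taken as the RMS distance of the points from their mean, distances use an equirectangular approximation, only the most distant point is dropped per pass (age is ignored), and the 100 m threshold and minimum of 3 points are arbitrary placeholders. It does not scale the threshold by the number of input points.

```python
import math
from statistics import fmean

EARTH_RADIUS_M = 6371000.0


def mean_point(points):
    """Simple mean of (lat, lon) pairs."""
    return fmean(p[0] for p in points), fmean(p[1] for p in points)


def distance_m(a, b):
    """Approximate distance in meters (equirectangular; fine over short ranges)."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    x = (lon2 - lon1) * math.cos((lat1 + lat2) / 2)
    y = lat2 - lat1
    return EARTH_RADIUS_M * math.hypot(x, y)


def spread_m(points):
    """RMS distance of the points from their mean, in meters."""
    center = mean_point(points)
    return math.sqrt(fmean(distance_m(p, center) ** 2 for p in points))


def partition(points, max_spread_m=100.0, min_points=3):
    """Drop the most distant point until the spread is acceptable or too few remain."""
    kept = list(points)
    while len(kept) > min_points and spread_m(kept) > max_spread_m:
        center = mean_point(kept)
        kept.remove(max(kept, key=lambda p: distance_m(p, center)))
    return kept


# Invented sample data: four nearby sightings and one far-away outlier.
sightings = [
    (40.0000, -75.0000),
    (40.0001, -75.0001),
    (40.0002, -75.0000),
    (40.0001, -74.9999),
    (41.0000, -75.0000),
]
print(partition(sightings))
```

The points that survive would then go into the signal-strength-weighted average from step 4.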
