New Stumbler with some questions about the Wigle project!

The gear needed for wardriving

8 posts • Page 1 of 1
Hello!

First post!

Questions!

I see Wigle's API is pretty limited to prevent scraping, which makes sense. However, is it also limited if I want to query my own uploads?
--> If so, any plans for a credit system for the Wigle API?
------> If so, have you thought about awarding API credits for new submissions? This solves the above issue of querying my own uploads, and is also an obvious road to monetization for Wigle if you ever wanted to go the SaaS route.

Raw-data, Trilateration, format, taxonomy.
I am working on creating a low energy war-driving device, and while I have found the CSV upload format for Wigle, I haven't managed to find the raw file format in the app's source/git.
(I do not have android development experience, so it's pretty foreign to me.)

As the CSV does not appear to include enough detail for trilateration, or other interesting processing, I assume it has to happen on my device (not low energy behaviour..).
To better assess what I am up against, I was trying to find the raw capture format / taxonomy. I understand it's an app database, but can't seem to find anything useful in my Pixel3 I am testing with.

Can someone point me towards either:
A raw data capture file. This would allow me to review the format, hopefully then I can learn about the steps that occur between initial capture, and eventual CSV format.
Alternatively, an explanation of the raw capture taxonomy. What exactly is captured, in what datatype, in what format is it uploaded to Wigle, and what preprocessing should I be doing (in addition to trilateration)?


Beginner's guide?
As someone relatively new to both stumbling for wifi, as well as Wigle, I haven't really found an explanation as to the long history of Wigle from a technical perspective.
Much of the above could likely be answered in a single spot somewhere, but there are also development decisions whose original thought process I am not clear on, such as:

Why are networks points, instead of areas?
Considering the signal strength is a radius, during trilateration, why not expand beyond a point and use an estimated area wherein the network resides?

What the heck are "tiles"?
I see reference in forum posts about "tiles", but haven't come across an explanation as to how big each tile is, if they are uniform across the globe, etc.


JW
1. there's an API flag for your own points that allows you to query them without a daily threshold. People who upload and demonstrate non-harvesting-bot-like behavior have higher query limits, as do folks who contact us with specific projects. We also provide KML and CSV export of data you've uploaded. Your WiGLE Wardriving client also contains a number of export options on the Database and Upload screens.

2. one of those export options is the SQLite database that includes your individual observations. Our "raw" captures on Android are limited by the contents of the WiFiManager API on Android, which doesn't include frame / packet data. If you're interested in a low-power device network detection, a WiGLE user has formulated a pretty efficient stumbler. If you want something comprehensive, Kismet is a remarkably awesome tool.

3. Network search results and map presence are summarized by their signal-strength-weighted trilaterated centroids to assist with querying and visualization, and network "detail" queries include individual observed points (in both the website UI and in the API). If you use the basic search interface, selecting an AP will draw concentric concave hulls around them based on the network/detail query. The Advanced Search detail query provides the individual points as well!

not sure about "tiles" - perhaps the posts you're looking at are about the old raster tile-based WiGLE visualization packages (JiGLE and DiGLE, which were retired almost a decade ago?). We'd caution you that context-free posts from that far back might not be worth your time!
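
For item 1 above, a minimal sketch of building a request for your own points might look like the following. It assumes the v2 `network/search` endpoint, HTTP Basic authentication with an API name and token, and a boolean `onlymine` query parameter. None of those names appear in this thread, so check them against the API documentation before relying on them. The helper name is hypothetical, and nothing is sent over the network here.

```python
import base64
import urllib.parse
import urllib.request

# Assumed endpoint; verify against the current WiGLE API documentation.
SEARCH_URL = "https://api.wigle.net/api/v2/network/search"


def build_own_networks_request(api_name, api_token, ssid=None):
    """Build (but do not send) a search request restricted to your own points.

    `onlymine` is the assumed name of the "own points" flag.
    """
    params = {"onlymine": "true"}
    if ssid:
        params["ssid"] = ssid
    url = SEARCH_URL + "?" + urllib.parse.urlencode(params)
    credentials = base64.b64encode(f"{api_name}:{api_token}".encode()).decode()
    return urllib.request.Request(
        url,
        headers={
            "Authorization": f"Basic {credentials}",
            "Accept": "application/json",
        },
    )


# Placeholder credentials for illustration only.
request = build_own_networks_request("example-name", "example-token", ssid="HomeNet")
```

Passing the returned object to `urllib.request.urlopen` would perform the actual query.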

some places to start reading:
https://wigle.net/faq
https://wigle.net/tools
https://api.wigle.net/csvFormat.html
https://wigle.net/wiwi_settings

Cheers,

-Ark and the WiGLE team
Thanks for your reply here as well! (I hope opening multiple threads like that is the right process..)


"If you're interested in a low-power device network detection, a WiGLE user has formulated a pretty efficient stumbler. If you want something comprehensive, Kismet is a remarkably awesome tool."
--> That git is neat! I'll have a go with it and see how it goes.
--> I am not sure, is here a good place to ask a bit about Kismet? I dug into the kismet docs, and actually tried installing it on a linux machine, but it seems I lacking in my under

"If you use the basic search interface, selecting an AP will draw concentric concave hulls around them based on the network/detail query. The Advanced Search detail query provides the individual points as well!"
This is very helpful, thank you! Can you point me to the file in GitHub where the calculation occurs? I want to better understand the trilateration process. (The centroid is just the center point, correct?)

"We'd caution you that context-free posts from that far back might not be worth your time!"
--> Totally, I do get the impression things are still evolving!





Thank you for the URL resources as well, some of this I did come across, but will be sure to read the rest before I ask too many more questions!



I did manage to track down the SQLite export option as well. The .schema there helps paint a better picture, but I notice that the CSV format doesn't actually match the network table in the .sqlite file:

CSV:


MAC,SSID,AuthMode,FirstSeen,Channel,RSSI,CurrentLatitude,CurrentLongitude,AltitudeMeters,AccuracyMeters,Type


sqlite> .schema
CREATE TABLE network (
  bssid TEXT PRIMARY KEY NOT NULL,
  ssid TEXT NOT NULL,
  frequency INT NOT NULL,
  capabilities TEXT NOT NULL,
  lasttime LONG NOT NULL,
  lastlat DOUBLE NOT NULL,
  lastlon DOUBLE NOT NULL,
  type TEXT NOT NULL DEFAULT 'W',
  bestlevel INTEGER NOT NULL DEFAULT 0,
  bestlat DOUBLE NOT NULL DEFAULT 0,
  bestlon DOUBLE NOT NULL DEFAULT 0
);
I assume the best(level|lat|lon) columns have to do with the weighting you mentioned for trilateration, is that correct?

As well, the sqlite shows "lasttime", is this because it's expected there will be multiple rows for each device, and "FirstSeen" can be easily calculated from there?
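
One visible difference between the two formats is that the CSV carries a Channel column while the network table stores a frequency. A minimal sketch of reading that table with Python's `sqlite3` module and deriving a channel number is shown below. The sample row is made up, the function names are hypothetical, and the idea that the CSV channel corresponds to this frequency column is an assumption based only on the column names. The frequency-to-channel arithmetic is the standard 2.4 GHz and 5 GHz numbering.

```python
import sqlite3

# Schema copied from the .schema output quoted in this post.
SCHEMA = """
CREATE TABLE network (
  bssid TEXT PRIMARY KEY NOT NULL, ssid TEXT NOT NULL,
  frequency INT NOT NULL, capabilities TEXT NOT NULL,
  lasttime LONG NOT NULL, lastlat DOUBLE NOT NULL, lastlon DOUBLE NOT NULL,
  type TEXT NOT NULL DEFAULT 'W', bestlevel INTEGER NOT NULL DEFAULT 0,
  bestlat DOUBLE NOT NULL DEFAULT 0, bestlon DOUBLE NOT NULL DEFAULT 0
)
"""


def frequency_to_channel(mhz):
    """Map a centre frequency in MHz to a Wi-Fi channel number, or None."""
    if mhz == 2484:
        return 14
    if 2412 <= mhz <= 2472:
        return (mhz - 2407) // 5
    if 5000 < mhz < 5900:
        return (mhz - 5000) // 5
    return None  # other bands are not handled in this sketch


def networks_with_channel(conn):
    """Return (bssid, ssid, channel, bestlevel) for every row in `network`."""
    rows = conn.execute(
        "SELECT bssid, ssid, frequency, bestlevel FROM network ORDER BY bssid"
    )
    return [(b, s, frequency_to_channel(f), lvl) for b, s, f, lvl in rows]


# In-memory stand-in for an exported database, with one invented row.
conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
conn.execute(
    "INSERT INTO network (bssid, ssid, frequency, capabilities, lasttime,"
    " lastlat, lastlon, bestlevel, bestlat, bestlon)"
    " VALUES (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)",
    ("00:11:22:33:44:55", "ExampleNet", 2437, "[WPA2-PSK-CCMP][ESS]",
     1600000000000, 40.0, -75.0, -55, 40.0001, -75.0001),
)
summary = networks_with_channel(conn)
```

To run this against a real export, replace `":memory:"` with the path to the exported file and skip the `CREATE TABLE` and `INSERT` statements.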
Two quick answers:

- The convex hull algorithm runs on your machine. It's JS we send you, so it's not in public GitHub. However, we intentionally serve our .js in non-minified format, so you can literally examine how we do it in the sources of the page in question! Look for "WiGLE.silverlinings" in mapsearch.js

- the difference there is between the local best-known-point (in the network table) and the individual observations. we perform trilateration on the server, sending observations from the device as they occur. check out the other tables in SQLite. Basically, we designed things so you will be able to do whatever you want with the data on your device; what you see in the upload format and on the server is designed to feed centralized data quality, the master observation list, and the eventual trilaterated result we produce on the site / in our API / on our maps.

-ark
I see, so to be clear, the actual trilateration calculation itself doesn't occur publicly, is that right?
--> I am hoping to figure this portion out, as it's the most interesting part! haha. I hope to eventually set up several inexpensive SoC devices to try my hand at real-time indoor location detection (many APs, one moving target device), so I wanted to see how the trilateration goes from raw data capture to a specific point.

Let me know if this is public, as I am having trouble finding it. If it's not something Wigle shares, I understand too.
it happens server-side - but it's not complicated!

1. filter out suspicious/garbage data. Locations that don't exist, GPS accuracy that might as well not have coordinates at all, timestamps that indicate the GPS clock was hosed, devices that can't exist.... (AND MANY MORE)
2. take a simple geospatial mean of the most recent points, then run a standard deviation on the locations vs. that mean. if the deviation is low (vs. the number of input points, mind you), congrats, you've got a candidate for a signal-strength weighted centroid!
3. if the deviation is high, then you need to do some cluster detection or sigma partitioning. depending on which you pick:
a. for partitioning, we recommend you start removing the oldest and/or most distant points, see if you're left with a reliable cluster. you can repeat this step, taking new means and sigmas until you're satisfied, or you're out of locations.
b. for cluster detection, pick a threshold, and sort the location set into proximal groups. use freshness/frequency/signal strengths/whatever you like to pick the "best" group of locations to continue
4. do a signal-strength weighted average of the final set of locations you've selected:
latCentroidNumerator = 0
lonCentroidNumerator = 0
denominator = 0
for location in locations {
    latCentroidNumerator += (location.latitude * location.signal)
    lonCentroidNumerator += (location.longitude * location.signal)
    denominator += location.signal
}
latitudeCentroid = latCentroidNumerator / denominator
longitudeCentroid = lonCentroidNumerator / denominator
There's a lot of weird detail and corner cases in our implementation, compensating for device quirks, vagaries of input, fraud detection... and our bad data detection stuff has an insane list of tests, but at the core, that's the algorithm at the center of most radio location you see.
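
The four steps above can be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the server-side implementation: the thresholds (50 m accuracy, 100 m sigma), the `Observation` type and all function names are invented here, only the "drop the most distant point" variant of step 3a is shown, and RSSI in dBm is converted to linear power for the weights, because raw dBm values are negative and would favour weaker readings if used directly.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0


@dataclass
class Observation:
    lat: float
    lon: float
    rssi_dbm: float    # e.g. -60.0
    accuracy_m: float  # GPS accuracy reported with the fix


def distance_m(lat1, lon1, lat2, lon2):
    """Equirectangular approximation, adequate over short distances."""
    x = math.radians(lon2 - lon1) * math.cos(math.radians((lat1 + lat2) / 2))
    y = math.radians(lat2 - lat1)
    return EARTH_RADIUS_M * math.hypot(x, y)


def plausible(o, max_accuracy_m=50.0):
    """Step 1: drop impossible coordinates and useless GPS fixes."""
    return (-90 <= o.lat <= 90 and -180 <= o.lon <= 180
            and not (o.lat == 0 and o.lon == 0)
            and 0 < o.accuracy_m <= max_accuracy_m)


def spread_m(points):
    """Step 2: distances from the plain mean, plus their RMS (sigma)."""
    # Averaging longitudes ignores the antimeridian; fine for a sketch.
    mean_lat = sum(p.lat for p in points) / len(points)
    mean_lon = sum(p.lon for p in points) / len(points)
    dists = [distance_m(p.lat, p.lon, mean_lat, mean_lon) for p in points]
    return math.sqrt(sum(d * d for d in dists) / len(points)), dists


def weighted_centroid(points):
    """Step 4: signal-strength-weighted average (assumed linear weights)."""
    weights = [10 ** (p.rssi_dbm / 10) for p in points]
    total = sum(weights)
    lat = sum(p.lat * w for p, w in zip(points, weights)) / total
    lon = sum(p.lon * w for p, w in zip(points, weights)) / total
    return lat, lon


def locate(observations, max_sigma_m=100.0):
    points = [o for o in observations if plausible(o)]
    # Step 3a: remove the most distant point until the cluster is tight.
    while len(points) > 1:
        sigma, dists = spread_m(points)
        if sigma <= max_sigma_m:
            break
        points.pop(dists.index(max(dists)))
    return weighted_centroid(points) if points else None
```

Calling `locate` on a list of `Observation` objects returns a `(latitude, longitude)` tuple, or `None` when nothing survives the filter.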
This is fantastically helpful, I really appreciate you taking the time to share to this detail!

I have already learned that wifi rssi is fickle and unreliable, even in relatively controlled, stationary setups.

I charted packet rssi from a single stationary device, added below mostly for sharing purposes. This obviously isn't news to you, but it does make it easier to understand why data cleanliness is at the root of this one.
[Image: chart of packet RSSI from a single stationary device]
(Also, this looks like wireless fading, right? https://en.wikipedia.org/wiki/Fading)
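
Because single RSSI readings jump around like this, one common way to tame them before any position estimate is to smooth the series. Below is a minimal sketch using a rolling median, which shrugs off isolated deep fades. The window size and the sample readings are made up for illustration, and this is one possible choice rather than anything WiGLE is stated to do.

```python
from collections import deque
from statistics import median


def rolling_median(samples, window=5):
    """Yield the median of the last `window` RSSI readings for each sample."""
    recent = deque(maxlen=window)
    for rssi in samples:
        recent.append(rssi)
        yield median(recent)


# Invented readings in dBm with one deep fade at -85.
readings_dbm = [-60, -61, -59, -85, -60, -62, -58]
smoothed = list(rolling_median(readings_dbm, window=3))
```

A larger window smooths more aggressively but reacts more slowly when the device really does move.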


==

"There's a lot of weird detail and corner cases in our implementation, compensating for device quirks, vagaries of input, fraud detection... and our bad data detection stuff has an insane list of tests"
I am hopeful that my setup, being a one-off, will keep the corner cases to a minimum. As I am working on this as a personal / pet project, fraud won't be an issue (no users).

Given my fairly consistent setup that doesn't have a gazillion user devices to worry about, and ignoring the issues related to the image above, are there any specific issues you expect I should be watching out for?



I am working in a linux environment for this project. I scanned through the `iw` source trying to understand the `iw scan` command.
Something I am unclear on with regard to wifi scanning is how channels are handled.

My understanding is that a wifi device has to be listening on the proper channel to receive frames on the same.
However, when working with Wigle, and a couple of other tools relating to wifi, it seems that channel handling has been mostly coordinated outside of the user view.

What I am unclear on: is channel hopping the proper/only solution to channel handling for drive-by wifi detection?
Forgive me, I need to add Android dev to my small toolbelt of experience: I scanned through the source and couldn't figure out whether Wigle handles this via channel hopping too, or if there is a cleaner method to listen to all channels?

If channel hopping is really the only solution, I have been thinking about gathering up a dozen cheap USB wifi adapters that support monitor mode to build a coordinated channel scanning setup, but I suspect that this may be overkill more than anything.
(Really, in theory, gathering up ~35 of these to actively monitor all channels seems like a cool project, but is likely a waste, even at <$7 each from China, lol)



You've given me a lot to tinker with as I move forward, looking forward to your thoughts!
--> I don't have anywhere else to chat wifi, seems to be a select crowd that has interest. My apologies if this is off-topic for the wigle.net forum!

JW
a lot of items to respond to here, so I'll keep this short - some of these are probably threads in their own right!

1. the graph certainly suggests multi-path propagation error *or* an external pattern of interference. It might be worth running that same experiment on different channels, as well as using a tool like WiGLE WiFi or WiFi Analyzer (https://play.google.com/store/apps/deta ... i.analyzer) to visualize frequency overlap in your location. (Note, this won't help with things like microwave ovens or portable telephones, but it's a start)

2. Different systems use different channel management options; since we use the Android API in WiGLE WiFi, we're stuck with Google's WiFiManager channel switching policies; they seem to work pretty well, but they're opaque. Some folks have hacked external WiFi adapters into their phones using a USB OTG adapter; some folks even run Kismet via Kali Linux on Android phones, but that's pretty intense.

Advanced packages like Kismet provide granular control; you can allocate different radios to different channels or channel groups, maximizing your portable scanning range and minimizing "timing" related loss. In the extreme case, you'll find amazing projects like d4rkm4tter's "WiFi Cactus" and WiFi Kraken:
http://palshack.org/the-hashtag-wifi-ca ... ef-con-25/
https://www.youtube.com/watch?v=8LQYgnSx3lI
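
To make the "different radios to different channel groups" idea concrete, here is a small planning sketch. It only does the arithmetic: it deals channels out to radios round-robin and computes how long each channel goes unobserved between visits. The function names, the channel list (2.4 GHz channels 1 to 11) and the 0.2 s dwell time are assumptions for illustration, and this is not how Kismet or Android is stated to schedule hops.

```python
def assign_channels(channels, radio_count):
    """Deal channels out round-robin so each radio hops over its own group."""
    if radio_count < 1:
        raise ValueError("need at least one radio")
    groups = [[] for _ in range(radio_count)]
    for index, channel in enumerate(channels):
        groups[index % radio_count].append(channel)
    return groups


def revisit_interval_s(group, dwell_s):
    """Seconds between two visits to the same channel for one hopping radio."""
    return len(group) * dwell_s


CHANNELS_2GHZ = list(range(1, 12))  # channels 1-11; regional rules vary
DWELL_S = 0.2                       # assumed time spent on each channel

one_radio = assign_channels(CHANNELS_2GHZ, 1)
three_radios = assign_channels(CHANNELS_2GHZ, 3)

worst_gap_one = max(revisit_interval_s(g, DWELL_S) for g in one_radio)
worst_gap_three = max(revisit_interval_s(g, DWELL_S) for g in three_radios)
```

With these assumed numbers, a single radio revisits a channel about every 2.2 seconds, while three radios cut the worst case to about 0.8 seconds. That gap is what matters for drive-by detection, since a network passed in less time than the gap can be missed.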

It doesn't solve your channel problem, but check out "monitor mode" access to wireless adapters, if what you're most interested in is a pure listener rather than an active network participant. There's a forest of driver complications that come with this, but it's supported in advanced packages and may inform your wireless dongle selection!

3. Search this forum for some home-built multi-channel rigs. People get a lot of mileage repurposing old routers and using cheap USB dongles with a raspberry Pi or old laptop, and there's neat software out there. Creativity and statistical obsession have led to some really remarkable solutions!
