I'm gratified to see the growing interest in the Scope Junction open-source high-performance oscilloscope (SJOSHPO) concept. You've already stepped forward with ideas ranging from ADC selection to system design and approach. Great -- let's keep going!
People have asked for block diagrams, so here's what I've got so far. There are still other options we can explore. A comment on the original blog suggested a more distributed approach -- each channel having a small daughtercard with a local AFE (analog front-end), ADC, FPGA, and memory.
This is definitely worth considering. In fact, at the Tek Scope Tour I recently attended (unfortunately, not in Europe; ah, here's the North American page), the FAE explained that this is precisely the architecture used in the MDO & MSO series (though with custom silicon -- sigh).
Scope block diagram -- stand-alone version.
The signal generator block serves as such, and also as a calibrator, and a source of fast edges for TDR functionality. Basic VNA operation should also be possible.
I've added "dither" to the AFEs to support high-res acquisition modes. Although advanced triggers will be handled by the FPGA, I think a quality analog trigger is still required. This is one area I haven't given much thought to yet. How do we correlate the trigger time with the asynchronous sampling clock?
The LA (logic analyzer) inputs are pretty straightforward I think.
The more I think about a physical control panel with lots of knobs and switches, the less value I see in it. Even though our UI Poll was overwhelmingly in favor of this old-school approach, I think it's possible to make an even better virtual control panel once we have enough screen real estate to play with, and use nice tricks like auto-hiding. Throw in basic voice-recognition (yes, that one is dear to my heart), and we'd likely have a killer UI. What do you think?
HDMI seems the best video connection method, though it can be difficult getting hold of non-HDCP chips (which don't require a license). But they're around -- I've used both.
Buffer RAM performance concerns me, as FPGA DDR controllers usually sport substandard specs. Perhaps stealing one of a Zynq CPU's DDR3 controllers would be a useful hack. Then again, looking at the performance of, say, a Kintex-7, indicates my knowledge might be out of date. Good!
The USB module block diagram is not too different:
Scope block diagram -- USB module version.
The USB connection has migrated from the "front-end" (keyboard, mouse...) to the "back-end," connecting to the host system. The processor(s) can be lower performance, and video is no longer needed.
Please continue the discussion below. Key points: architecture, module vs. all-in-one, trigger/sampling sync, and UI.
I'm starting to warm to Wolfgang's daughtercard idea, though probably without FPGA.
Either way, a high BW channel will be required to the main FPGA to support fast display modes. And either way, the daughtercards will be difficult to test without the main board. Though I suppose, with an FPGA, one could devise various test modes to exercise the hardware... Hmm.
The main board won't be so expensive once the converters are off it though, so the idea of needing one to develop daughtercards is more palatable.
In fact, almost any cheap, off-the-shelf, FPGA devboard could be used for daughtercard dev & test.
Good, a picture says more than a thousand words, thus a block diagram is worth a lot.
For the USB version of the scope, I don't see too many reasons to keep a front panel at all. Seems to counteract the concept (at least for me).
Second, I am not a big friend of overusing microcontroller softcores. For high-performance applications that's an expensive way to get computing power. I'd rather use a smaller, cheaper(!) FPGA doing just the minimum that HAS to be done in an FPGA, and keeping the rest (especially the microcontroller/microprocessor) outside. Also I think many more people are conversant in C than are good at VHDL/Verilog, so doing only the minimum necessary in dedicated logic will allow more people to productively participate. I really think in such a community project addressing the widest possible circle of potential contributors is as important as (or even more important than) having a solid technical concept. I have seen too many such projects fizzle out after initial enthusiasm. (Another thing I've learned from the many failed efforts that I watched, and the few that succeeded, is that in order to be successful it always takes one or two main contributors taking over the lead, driving the others, distributing the work to some extent, and making the ultimate decisions.)
I agree about softcores – they can be quite inefficient. But as I'm thinking a fair sized FPGA will be required here, a softcore is not going to be a big resource strain.
Anyway, I don't want to think about implementation details like that too much yet!
One detail though – it might be easier finding a hardcore or external micro with a good USB interface, than putting it on the FPGA. USB IP usually costs. Not sure if there's a controller on Opencores yet...
Yes, definitely more C than HDL people out there ;-) But writing C for a softcore would not be any different than for a separate processor, would it?
>one or two main contributors taking over the lead, driving the others, distributing the work to some extent, and making the ultimate decisions
Jonny Doin 6/12/2012 1:28:40 AM User Rank Apprentice
Re: Daughtercards
@Michael:
"One detail though – it might be easier finding a hardcore or external micro with a good USB interface, than putting it on the FPGA. USB IP usually costs. Not sure if there's a controller on Opencores yet..."
I agree that using external USB pipes will probably make more sense. The ubiquitous FT2232, for example, has 2 channels over a 480Mbps HS channel, and can drive each channel as a 25MB/s FIFO. This is enough to stream video and ancillary data to a PC/Mac. There are DLLs available that allow you to do magic like high-speed complex bit-banging over the HS interface, and ready drivers. The VHDL interfacing code is really simple to write.
"Yes, definitely more C than HDL people out there ;-) But writing C for a softcore would not be any different than for a separate processor, would it?"
Well, yes, but certainly there are a lot of good VHDL enthusiasts among us :-)
One good approach would be to design the core logic as pure hardware, and make access ports on the framework for software to tap on for control data, and have a good solid datapath stream to interface to.
I really like the idea of having dynamic layers of logic defined by software (and probably synthesized/compiled/downloaded in run time) for such things as special trigger modes.
One royal pain is, for example, trying to trigger at a given frame of SPI that is NOT EQUAL to some pattern. Or triggering at a certain SPI eeprom address range, accessed for write, within a time window from another SPI frame from a saturated ADC reading.
Those very real-world trigger needs are simply not met by any scope under US$100K. An FPGA-based scope should allow such sorcery.
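To make the two trigger conditions above a bit more concrete, here is a rough software model of them. It is only a sketch: `SpiFrame`, `trigger_not_equal` and `trigger_within_window` are hypothetical names invented for illustration, the frames are assumed to be already decoded and in time order, and a real implementation would live in FPGA logic rather than Python.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass
class SpiFrame:
    """One decoded SPI frame (hypothetical model, not a real decoder API)."""
    timestamp_ns: int  # time of the frame's first clock edge
    data: int          # frame payload


def trigger_not_equal(frames: Iterable[SpiFrame], pattern: int,
                      mask: int = 0xFF) -> Optional[SpiFrame]:
    """Return the first frame whose masked data is NOT equal to the pattern."""
    for frame in frames:
        if (frame.data & mask) != (pattern & mask):
            return frame
    return None


def trigger_within_window(frames: Iterable[SpiFrame],
                          is_arming: Callable[[SpiFrame], bool],
                          is_target: Callable[[SpiFrame], bool],
                          window_ns: int) -> Optional[SpiFrame]:
    """Return the first target frame seen within window_ns after an arming frame."""
    armed_at = None
    for frame in frames:
        if (armed_at is not None
                and frame.timestamp_ns - armed_at <= window_ns
                and is_target(frame)):
            return frame
        if is_arming(frame):
            armed_at = frame.timestamp_ns  # (re)start the time window
    return None


capture = [SpiFrame(0, 0xA5), SpiFrame(100, 0xA5),
           SpiFrame(200, 0x3C), SpiFrame(300, 0xA5)]
odd_one_out = trigger_not_equal(capture, pattern=0xA5)
```

In the windowed version, the arming predicate could stand for "frame carries a saturated ADC reading" and the target predicate for "EEPROM write inside the address range"; both are left to the caller here.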
By the way, count on me for VHDL, low-level C, ARM assembly and general AFE and PCB design. Will work for food.
My biggest host software concern is dealing with the cross-platform issues for things like the USB. I don't know enough yet to guess how troublesome this will be, but that's perhaps the biggest argument in favour of an all-in-one design, to my mind.
This design has to run well on Linux, first and foremost. (I hear it's free ;-)
>Or triggering at a certain SPI eeprom address range, accessed for write, within a time window from another SPI frame from a saturated ADC reading.
Thanks for volunteering to code that!
>Will work for food.
Not sure how well my quail au framboise will ship through the mail...
VHDL? That brings up the question of standardization. Which HDL? What language on the host? Hope the arguments don't get too bitter. I won't start any. Yet.
Jonny Doin 6/12/2012 5:42:53 PM User Rank Apprentice
Re: Daughtercards
@MD:
>> "If I understand correctly re your CERN analogy, your're envisioning a cascade of more-or-less identical trigger logic blocks, separated by gate delays? That would be fun to implement in an FPGA ;-)"
Our problem is similar to what the guys at the LHC have at hand, put into perspective. The problem at ATLAS is that a gazillion collision events are generated at the detectors, and not all that data is relevant to the researchers.
The goal is to capture only the detailed data for those elusive muons, not the garden variety positrons. The data sits in the ring's local memory for some 10ms, and then a torrent of new data floods the memory and erases the previous data.
The beauty of the segmented trigger approach is that the first ring only has to give a shallow look at the dataset and make a first-order guess at the ballpark for the current chunks. The FPGAs at the frontline can do that quick glance at the raw speed of the process, using superfast classification circuitry.
Only a few zillion collisions are selected for a firmer look at a second-order ring, which extracts just a few thousand collisions. By that point the data is at the right bandwidth to be collected at the network storage and sent on for detailed analysis.
Applied to our case, the approach may allow using standard grade FPGAs in a similar ring structure to achieve realtime triggering at a very high screen update rate.
"[...] your're envisioning a cascade of more-or-less identical trigger logic blocks, separated by gate delays?"
Not quite identical trigger blocks, but I would say a similarly provisioned processing power for each ring, with different concerns. The trigger logic at each block would focus at classification of candidate events for the currently selected trigger mode.
Let's say the current trigger mode is to trigger on a risetime of 1000V/us +/-200V/us, with a 10%-to-90% excursion.
The first trigger ring would look at the raw timestamped ADC stream, before storage at the DDR deep buffer, and just compute a very coarse 1st derivative on small chunks of data, selecting the timestamps of candidate high-derivative slopes. A streaming list of those timestamps of the same trigger type candidates is served to the next ring, that can focus on the stored data at the DDR file and compute the slope with greater precision, using roughly the same computing time and resources for the smaller number of candidate events.
You get the picture. This approach is at the same time more economical and lower-cost.
This is something I am thinking about since I decided to do my scope, some time ago...
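A minimal software sketch of the two-ring idea described above, using the 1000V/us +/-200V/us example. The function names, the chunk size of 8 samples and the "half the target slope" coarse threshold are all assumptions made for illustration; the real thing would be pipelined FPGA logic working on the timestamped ADC stream and the DDR buffer.

```python
def coarse_candidates(samples, dt_us, chunk, min_slope):
    """Ring 1: one subtraction per chunk; returns start indices of steep chunks."""
    hits = []
    for start in range(0, len(samples) - chunk, chunk):
        slope = (samples[start + chunk] - samples[start]) / (chunk * dt_us)
        if slope >= min_slope:
            hits.append(start)
    return hits


def refine(samples, dt_us, start, chunk, target, tol):
    """Ring 2: 10%-90% slope around one candidate; returns V/us or None."""
    window = samples[max(0, start - chunk):start + 2 * chunk]
    v_min, v_max = min(window), max(window)
    v10 = v_min + 0.1 * (v_max - v_min)
    v90 = v_min + 0.9 * (v_max - v_min)
    i10 = next(i for i, v in enumerate(window) if v >= v10)
    i90 = next(i for i, v in enumerate(window) if v >= v90)
    if i90 <= i10:
        return None  # flat window, or edge faster than one sample
    slope = (window[i90] - window[i10]) / ((i90 - i10) * dt_us)
    return slope if abs(slope - target) <= tol else None


def two_stage_trigger(samples, dt_us, chunk=8, target=1000.0, tol=200.0):
    """Run the cheap pass on everything, the precise pass on candidates only."""
    results = []
    for start in coarse_candidates(samples, dt_us, chunk, min_slope=0.5 * target):
        slope = refine(samples, dt_us, start, chunk, target, tol)
        if slope is not None:
            results.append((start, slope))
    return results


# 1 GS/s capture (dt = 0.001 us) with a 0-to-10 V ramp of 1 V per sample.
edge = [0.0] * 20 + [float(v) for v in range(1, 11)] + [10.0] * 20
events = two_stage_trigger(edge, dt_us=0.001)
```

Note that two adjacent chunks can flag the same edge, so a real design would probably merge candidates that are close together before handing them to the second ring.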
scopehandler 6/6/2012 10:41:04 AM User Rank Apprentice
Re: Daughtercards
Well, this block diagram is great with all required front ends. At this point I would like to add following blocks/ functions:
1. Ethernet port: This way user can have remote access and a distant user can configure the scope as per his debug techniques. This will virtually avoid physical presence at the DUT.
2. USB port: I would also like to extend the functionality of USB port with ability to direct access to system API. This way user can at any point write custom algorithms to suit his requirements.
3. Digital Video: In place of HDMI video out we can have DVI out. This will avoid HDMI licensing. It is also important to consider the size of the display. If the display is going to be on the small side, even a 1920x720 video resolution will go unnoticed on a 30-inch TV at a distance of more than 3ft. HDMI does have a smaller connector than DVI, though. HDMI will reach 1080p at most, while single-link DVI can go up to 1200 lines.
Re Ethernet - sure. I didn't want to add any more complexity than necessary at this point. If a PC is the controller, Ethernet will be there. If an embedded controller, it will be there, or be easily added. At that point, it becomes a software issue!
Not sure what you mean re the USB, but again, sounds like software ;-) If you mean, as a way to extend scope functionality, I imagine the software would be designed to make the addition of new functions as easy as possible.
DVI, for sure. Does anyone still make DVI-specific chips!? This only applies to the self-contained version, which we seem to be moving away from... But it's not dead yet! :-)
Since HDMI was derived from the DVI spec, most HDMI devices these days also support DVI.
The two big differences between DVI and HDMI are audio and security. Obviously DVI only supports video, no audio. The HDMI spec includes HDCP (High-bandwidth Digital Content Protection) to 'protect' the content. Most manufacturers make HDMI transmitters and receivers with and without HDCP.
Let's briefly review our thinking re the scope hardware. Slightly edited from the original blog:
1-2 GS/s, 2-4 channel (daughtercards)
Daughtercard connectors for AFEs (FMC or HSMC)
Increased resolution at lower sampling rates (e.g., at least 16 bits in the audio range)
500MHz BW
Support for an ET front-end to a few GHz (maybe)
A large buffer (at least 128MS (was 16MS)), segmentable
Decent set of advanced triggers
Logic analyzer inputs
Fast-acquisition mode (as versatile as possible)
Signal generator
Soft UI
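As a rough sanity check on the "increased resolution at lower sampling rates" item, here is a sketch of the usual oversampling rule of thumb: averaging N samples gains about half a bit per doubling of N. This assumes the noise is white and at least around 1 LSB (which is what the dither is for), and an 8-bit converter, which is my assumption and not part of the spec above. The function names are made up for illustration.

```python
import math


def hires_bits(adc_bits, adc_rate_hz, output_rate_hz):
    """Ideal resolution after averaging adc_rate/output_rate samples per point."""
    oversampling = adc_rate_hz / output_rate_hz
    return adc_bits + 0.5 * math.log2(oversampling)


def boxcar(samples, n):
    """Average consecutive groups of n samples (the simplest hi-res filter)."""
    return [sum(samples[i:i + n]) / n
            for i in range(0, len(samples) - n + 1, n)]


# Assumed example: 8-bit ADC at 1 GS/s averaged down to a 48 kS/s audio rate.
audio_bits = hires_bits(8, 1e9, 48e3)
print(f"ideal resolution at 48 kS/s: {audio_bits:.1f} bits")
```

Under these idealized assumptions the result comes out a little above 15 bits, so the 16-bit target looks to be within reach but not automatic with plain averaging.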
A USB module implementation has more support right now, but I'm not going to declare it a victor quite yet. Let's see how things go when we start taking a closer look at design details. It may even be possible to make the HW universal, but I wouldn't want to pursue that if it cost significant time or money. Universality could also dilute our software efforts somewhat.
The daughtercard idea is the biggest change so far. I've left some target AFE specs above, but of course, that's a function of the cards plugged into the "mainframe". But those specs also guide the required mainframe (OK, main board) performance.
Ignoring implementation and software details, does this sound like a good base spec?
In a similar don't-really-want-to-think-about-this-now-but-can't-help-myself vein, I've been pondering what a good language would be for all the code.
C is the easy & obvious choice.
Honestly though, I would prefer a "true" high-level language, as opposed to a universal assembler ;-)
I'm a fan of Python, but in its native interpreted state, it's far too slow. But... There are compiling versions of the language (e.g., PyPy). If you have experience with this, do write about it.
BTW, the Raspberry Pi uses Python as its main language :-)
MD, Looks like a great set of base specs and features for the scope.
I like the idea of a self contained scope over the USB module. But, I agree a USB scope has its advantages for the design goals.
The Raspberry Pi board is pretty cool. It does make me want to run out and grab a raspberry danish or jelly donut, though :) My only concern with the board is the Broadcom processor. Depending how low level you need to go in the design, getting datasheets for the processor could be difficult.
While still liking all-in-one, I'll confess the advantages of a USB module: e.g., It can still appear as an all-in-one by building it into a box along with your host computer board. Also lets you upgrade to faster hosts over time.
There's a 200pg "abbreviated" datasheet from Broadcom that only documents the ARM peripherals, so that's something. These secretive companies do not impress me.
jsalsburg 6/9/2012 4:14:10 AM User Rank Apprentice
FPGA
I created this Chart of the FPGA Interface based on the Reference Design.
Port A, High Speed differential 8-bit I/O Page Memory Interface to ADC
Port B, High Speed differential 8-bit I/O Page Memory Interface to ADC
16-bit I/O Microcontroller Interface, I2C, Trigger, Programming Control Port, TX/RX Strobes, Flags, PROGRAM, Power On Reset, Clocks
Interface Port to Front Panel USB Controller
Interface Port to Computer USB Controller
Interface Port to GPU Controller
jsalsburg 6/9/2012 8:19:37 PM User Rank Apprentice
Ask yourself these questions...
1. What Data Rate is required from the FutureScope to a PC, to effectively analyze Signals Acquired by the Analog Inputs?
2. How am I to get the Data out fast enough and still be able to push Commands and Settings?
3. Am I going to have to Design a Completely new Display System inside the FutureScope for a Dumb Monitor, or is there a better Display System?
4. Is USB fast enough to do everything I need?
5. What will be the Latency of USB to update the PC and back to the FutureScope during Acquisition, can it be real time?
6. Am I going to have to create completely new Drivers for the PC?
7. How can I power the FutureScope, is USB Enough?
8. Am I going to have to Design a Memory System to Buffer the Acquisition Stream inside the FutureScope or can I use the Memory and Display System in the PC?
Key Features of Thunderbolt...
Dual-channel 10Gbps per port
Bi-directional
Dual-protocol (PCI Express and DisplayPort)
Compatible with existing DisplayPort devices
Daisy-chained devices
Electrical or optical cables
Low latency with highly accurate time synchronization
Uses native protocol software drivers
Power over cable for bus-powered devices
1. I'd say the WC (worst-case) data rate would be in fast acq mode. A generous 1k x 1k x 8 image = 8Mb. At 30fps, that's 240Mb/s, half of the 480Mb/s (theoretical) pipe, and a larger share of what USB 2.0 delivers in practice. So, USB is enough, but true, not tons of elbow room.
3. Self-contained scope w/dumb monitor? As long as there's an HDMI chip on board, driving it is trivial. Frame buffer could be shared with the processor RAM – 1080p30 would take 5-10% memory bandwidth, or it could have its own RAM.
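The arithmetic behind both answers is easy to check. The sketch below works out the fast-acq image stream against the USB 2.0 high-speed signalling rate, and the raw scan-out rate of a 1080p30 frame buffer. The 32 bits per pixel figure for the frame buffer is my assumption; `stream_mbps` is just an illustrative helper.

```python
def stream_mbps(width, height, bits_per_pixel, fps):
    """Raw image stream rate in megabits per second (1 Mb = 1e6 bits)."""
    return width * height * bits_per_pixel * fps / 1e6


USB2_HS_MBPS = 480  # theoretical signalling rate; usable throughput is lower

# Fast-acq composite image: 1k x 1k pixels, 8 bits each, 30 frames per second.
fast_acq = stream_mbps(1000, 1000, 8, 30)
print(f"fast-acq stream: {fast_acq:.0f} Mb/s "
      f"({100 * fast_acq / USB2_HS_MBPS:.0f}% of the USB 2.0 HS rate)")

# 1080p30 frame buffer scan-out, assuming 32 bits per pixel.
frame_buffer_mbytes = stream_mbps(1920, 1080, 32, 30) / 8
print(f"1080p30 frame buffer: {frame_buffer_mbytes:.0f} MB/s")
```

If that scan-out rate is to be only 5-10% of the memory bandwidth, the shared RAM would need to sustain a few GB/s.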
jsalsburg 6/9/2012 11:04:20 PM User Rank Apprentice
Re: Ask yourself these questions...
I am trying to get everyone to jump past the old way of doing things. The old way is to design-in obsolescence. To require Memory and Display inside is old school. Look at the Reference Design: no Memory or Display; it streams to a Parallel FutureBus facilitating connection to an I/O device. Why design in extra stuff when you do not have to? It is a pain and an extra expense that hamstrings the device. The application of a bidirectional high speed serial interface eliminates Memory, Display, and Control hardware, not to mention the added cost of a significantly larger and more expensive Programmable Logic Device. The addition of Memory and Display also requires significantly more Programming and debugging. Interfacing the Acquisition system to a PC through a high speed serial interface allows the Device to stream the Acquisition Data directly into the memory in the PC or another Memory device. Today's PCs can read and write data to their SSD at greater than 3 Giga Bits per second. This allows software to do all the work, eliminating hardware design difficulty and all the limitations that go with it. Soon, low-cost PCs will be able to read and write at greater than 6 Giga Bits per second. Any design that cannot take advantage of technology improvements (like this) is self defeating.
You've never really explained what you wanted Thunderbolt for. If I understand correctly, you're proposing that the "scope" be little more than the analog front ends, connected to a TBolt interface.
TB can handle 2GB/s, ignoring any system or processor limitations. That's *one* fast ADC channel. Seems pretty limiting.
What about fast acq modes? Could a PC process 2GB/s and create a composite frame buffer? Hmm, probably, with some careful programming.
The AFE performance still defines an "obsolescence", so to speak.
It's a cool idea, absolutely. And it would work if we're willing to give up some fast acq performance. But even for "regular" operating modes, you'll need buffer memory and an FPGA to handle the capture, then send it out the TBolt port.
But if we decide that 4ch/500MS or 2ch/1GS performance is enough, then your idea could fly!
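For reference, a quick sketch of how those configurations compare with the 2GB/s link figure quoted above. It assumes 8-bit samples streamed raw with no protocol overhead, which is optimistic; the third configuration is an extra one added only for contrast.

```python
def adc_stream_gbytes(channels, sample_rate_gsps, bits=8):
    """Raw aggregate ADC data rate in GB/s (assumes no packing overhead)."""
    return channels * sample_rate_gsps * bits / 8


LINK_GBYTES = 2.0  # Thunderbolt figure quoted in the comment above

configs = {
    "4ch/500MS": (4, 0.5),
    "2ch/1GS": (2, 1.0),
    "4ch/2GS": (4, 2.0),  # added for contrast, not proposed in the thread
}
for name, (channels, rate) in configs.items():
    need = adc_stream_gbytes(channels, rate)
    verdict = "fits" if need <= LINK_GBYTES else "exceeds the link"
    print(f"{name}: {need:.1f} GB/s, {verdict}")
```

Both 4ch/500MS and 2ch/1GS land exactly on the link limit, so there would be no headroom left for commands or overhead.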
I wonder when TBolt will appear on lower-cost machines...
Though I expect it'll be a few years before a cheap little board (future Atom, ARM, whatever), with ThunderBolt, with enough horsepower, will be available...
jsalsburg 6/10/2012 11:34:54 PM User Rank Apprentice
Re: Ask yourself these questions...
Yes, the Acquisition Stream is transferred, in real time, directly to the PC's Memory System. Perhaps it may be a better idea to create a PCI Express Device. It would probably be much less costly and easier to design.
I'd recommend sticking with a widely used, standard interface to the host. Nothing exotic. USB would be my top choice because of this. As for available bandwidth, for "normal" display all the eye needs is somewhere between 25 and 60 fps. Each channel is only a few KB of data for display unless you always want to capture MS worth of data (but you could easily get around that by transmitting only a subset of samples when zoomed wide). For fast acquisition the FPGA could accumulate an image of the screen (similar to Tek's "waveform database" of the CAS series scope) and transmit that 30-60 times a second. That would still keep bandwidth requirements moderate.
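The fast-acquisition idea above, where the FPGA accumulates an image of the screen and only that image crosses the link, can be modeled in a few lines. This is a sketch under assumptions: 8-bit sample codes, naive one-sample-per-column decimation, and a hypothetical `accumulate` function; row 0 of the image corresponds to the lowest ADC code.

```python
def accumulate(waveforms, width, height, full_scale=256):
    """Build a hit-count image: image[y][x] counts traces passing through (x, y)."""
    image = [[0] * width for _ in range(height)]
    for waveform in waveforms:
        step = len(waveform) / width
        for x in range(width):
            sample = waveform[int(x * step)]  # naive decimation to one column
            y = min(height - 1, sample * height // full_scale)
            image[y][x] += 1
    return image


# Two short 8-bit captures reduced to a 2-column, 4-row "screen".
captures = [[0, 0, 255, 255], [0, 64, 255, 255]]
screen = accumulate(captures, width=2, height=4)
```

However many waveforms are captured per frame, the amount of data sent to the host stays at width x height counters, which is what keeps the bandwidth requirement moderate.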
Keeping the design modular also helps with obsolescence. It's much easier to e.g. design a new, faster (or higher-resolution) channel card if a better ADC becomes available (or the old one is no longer produced); the rest of the box doesn't need to change. For the FPGA, not using too many special custom features of the particular one used will help with porting the design to a newer FPGA (or a different vendor). A well-designed modular scope can grow performance and features over time without ever requiring a complete overhaul.
Speaking of FPGA, I think implementing the full trigger functionality in the FPGA (instead of a dedicated trigger circuit) is preferable - less circuitry to worry about, lower cost, and allows changes later on without change to the hardware.
I agree, keeping the design as modular as possible is an excellent idea. In addition to helping with obsolescence and future upgrades, it seems like this would allow more people to actively participate.
Those with specific experience or interests in certain areas can focus on a particular module. All that's needed is a well defined interface and some basic requirements. It creates several smaller designs that can then be combined into the final project.
The main FPGA board could be sold as a stand-alone item for general FPGA development. We scopers could ride the wave of increased production/decreased cost.
Or better yet, find a board that already exists. Unfortunately, they tend to polarize between cheap and low-end on one side, and overpriced and higher-end on the other.
Anyone know of other places like SeeedStudio, but with better capabilities (e.g., >2-layer PCB!)?
Jonny Doin 6/12/2012 12:41:06 AM User Rank Apprentice
Re: Obsolescence & modularity
Modularity is an interesting idea, for all reasons posted.
A modular approach, however, requires a solid framework of interfaces, to avoid bottlenecks. The design of the interconnect must precede the modules.
The datapath must have adequate stream bandwidth between the ADC front-end, trigger, memory and display blocks.
The module interconnects ideally should support data rates well above the initial design requirements, to allow future operation at higher performance levels.
Connectorization and board plan will affect total cost. Probably a fully modular system will be more costly than an integrated design, due to high-speed board-to-board interconnects. FMC connectors come to mind to provide high-speed with signal integrity.
One interesting approach to achieve cost efficiency is to design a brick with a basic FPGA engine, point-of-load supplies, FMC connectors with the interconnect fabric, essential memory and program/debug access ports. A number of these modules could be used to implement aspects of the datapath, maximizing the PCB reuse.
More specific modules, like the Analog Front-End/ADC, the DDR memory, display generator, could be specific-purpose boards.
One aspect to look for to keep PCB costs down is to try and keep above the microvia limit for the BGAs, and keep under 10 layers (8 if possible) for the majority of modules. Some PCB features, however, are not as expensive as they look at first. For example, blind vias and buried vias, if the stackup is well chosen, have a lower impact than microvias on PCB cost.
However, a good modular design will probably have a higher total cost than the equivalent integrated design, due to costly connectors, EMI and drivers, distributed power supplies, total PCB area, extra pick-and-place / assembly costs. The caveats of a fully integrated design would be the higher cost of a single failure, more difficult PCB floorplanning / routing, higher costs of re-spins.
All of this is highly speculative if grand design goals are not correctly laid out. It is one thing to achieve 5GSPS/20Mpts, and a totally different ball game to achieve 500MSPS/2Mpts.
It's the Goldilocks problem: Finding the combination of parts that's "Just Right" – not too big and expensive, not too small and limiting, etc, etc.
I don't want to go too overboard with modularity. Probably just the AFEs is my thinking. But definitely agree – the datapaths have to be considered throughout the system, leaving headroom wherever possible – both for miscalculations, and future expansion.
I've used HSMC, not FMC, but I gather the concept is the same. And, we could mix'n'match: Use, say, a Xilinx FPGA with Altera HSMC connectors, if that turned out to be the best of both worlds! Fun, wow.
Jonny Doin 6/12/2012 11:37:49 PM User Rank Apprentice
Re: Obsolescence & modularity
@MD:
>> “I don't want to go too overboard with modularity. Probably just the AFEs is my thinking. But definitely agree – the datapaths have to be considered throughout the system, leaving headroom wherever possible – both for miscalculations, and future expansion."
We are on the same page here. The modular AFEs allow us to evolve the logic and trigger concepts using, say, a 500MSPS AFE that fits under a $300 budget, and then gear up to higher stream rates.
Anyway, regardless of physical modularization, the architecture of the streams, interfaces, DDR file, internal bus interconnects, external data ports, must be laid out before module segmentation.
- Jonny
I'm thinking a lower sampling rate AFE really will be the best "first cut". Overall BW can still be 500MHz or more, but some amount of repetitive sampling will be required.
This will reduce the $ burden, interest more people, and give us valuable experience. By the time we get to building the 1 or 2 or 5GS AFE, the technology will be that much better and/or cheaper. Yeah, I like it.
I haven't pulled the trigger just yet, but I am seriously considering designing a low end USB scope that would use a minimal set of hardware such as no amps with five ADCs directly on the processor chip (four channels plus trigger). The processor chip has tons of processing power, but not much memory. It is also a little short on I/O so adding logic analyzer functions would require a low end FPGA. I think a USB connected unit could use well under $100 worth of parts.
The ADCs are unusual as they don't have a set sample period. Each ADC is a counter driven by a voltage to frequency converter running between 2 and 5 GHz. You set the sample rate by reading the counter at fixed periods and subtracting adjacent samples. I'm not sure if it will be better to use lower sample rates or to always sample at the max rate and for slower sweep speeds use signal processing to combine the results. At higher sample rates the ADC resolution suffers but at slower sample rates you can get very high resolution. I expect signal processing will get similar resolution for the high sample rate when using slower sweep speeds. In essence the bandwidth changes when you change the sweep rate.
The processor also has five fast DACs which can be used as signal generators or cal signals.
This processor chip shines when you need to do processing on the collected data. There are 144 async processors in an array to be used much like the LUTs in an FPGA. So I like to call it an FPPA, Field Programmable Processor Array. I think this chip has some real potential in this app.
The only part I have not yet figured out is the interface to a display. I'm thinking of a USB interface to a PC or tablet, etc. But I'm not sure just how much data this should provide or how it should be formatted. I'm thinking at max rate one channel generates 100 MB/s which is far too fast for high speed USB. Obviously the data stream needs to be processed so only the displayed data is sent to the host. I'm not sure how best to reduce that data rate. I'm thinking multiple sweeps would need to be combined into a display frame and pushed out at some rate that the comm channel can deal with, like maybe 20 fps.
Very interesting! I assume this project would be mostly for fun/education, given there are already a number of low-cost products of that nature (though maybe you're thinking this chip will allow for much better performance?)
I've come across a few processor arrays in my time (in fact, was very interested in the Transputer way back when). Which is the one you're describing? Sounds expensive – if the processors are of significant horsepower.
Interesting ADC implementation. If I grok it correctly, 8-bit resolution would result in a sample rate around 10MS/s.
Yes, 10 MSPS would equal about 8 bits. This also gives capability of going to higher rates with lower resolution.
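The 8-bit figure follows from the counter description: with the voltage-to-frequency converter spanning 2 to 5 GHz, the number of distinguishable counts per sample is the 3 GHz span divided by the sample rate. A small sketch of that reasoning follows; the function names and the 32-bit counter width are assumptions for illustration, not details of the actual chip.

```python
import math

F_MIN_HZ = 2e9  # converter frequency at one end of the input range
F_MAX_HZ = 5e9  # converter frequency at the other end


def counts_per_sample(sample_rate_hz):
    """Full-scale spread of the count difference between adjacent readings."""
    return (F_MAX_HZ - F_MIN_HZ) / sample_rate_hz


def effective_bits(sample_rate_hz):
    """Resolution implied by that spread, ignoring jitter and noise."""
    return math.log2(counts_per_sample(sample_rate_hz))


def samples_from_counter(readings, modulus=2**32):
    """Subtract adjacent counter readings, allowing for counter wrap-around."""
    return [(b - a) % modulus for a, b in zip(readings, readings[1:])]


for rate in (1e6, 10e6, 100e6):
    print(f"{rate / 1e6:.0f} MS/s -> {effective_bits(rate):.1f} bits")
```

At 10 MS/s this gives 300 counts of spread, or a little over 8 bits, and each factor of two in sample rate costs one bit.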
The chip is the GreenArrays GA144 with 144 processors running at up to 700 MIPS each. Sold in small quantities for $20 but somewhat less at higher quantities. An external ROM and RAM is needed for most applications. The device will boot from an SPI Flash device or through a serial port.
I don't see this so much as a toy but as a low end, starter project for this chip. Maybe that is what you mean by "educational". Ultimately I want to use it in a software defined radio app. Most SDR apps provide for SNR gain as you narrow the bandwidth in the processing chain. So low resolution IF samples can provide adequate SNR by the time the processing produces a baseband signal. Unfortunately this same technique doesn't work for oscopes unless you limit the bandwidth at slower sweep rates.
Yes, what are the odds!!? Of course they are similar, they were designed by the same person. The Intellasys chips were the second(ish) generation IIRC and the GreenArrays devices are the third generation. 700 MIPS processors running asynchronously with a rectangular grid of comms can be a powerful tool, but you can't think of them as 144 TMS type DSPs or even A8 ARMs as they each have very little memory and the only access to the rest of the world is through the four comms channels to the adjacent nodes or the periphery of the chip. This is why, as I said earlier, I think of them as FPPAs (Field Programmable Processor Arrays).
I think there are some aspects of typical thinking that need to be abandoned to use the GA144 effectively. The first is that you need to use all the CPUs efficiently. To compare, when using an FPGA do you really care if a 4 input LUT is used to implement an inverter wasting all the other "gates" in the LUT, the FF and the rest of the hidden logic and wiring... not really, because you have too much to think about already! If a few gates are not 100% optimized you just don't worry about it. Likewise you shouldn't worry if any single GA144 node is only used to pass data to adjacent nodes or if it is otherwise used for a trivial task like a UART. As long as you have enough nodes and enough horsepower to get the job done it is a good design. Many people can't seem to get past this. Since they can't see how to use 100% of every node it must not be a good chip.
BTW, the main language for this chip is colorForth rather than eForth. There is an interpreter that lets users run a version of "standard" Forth using external memory. I'm not sure how fast it is though.
I don't want to hijack the forum to discuss the GA144, I just wanted to describe what I expect to be doing in the coming months and see if we might find synergy.
I'm guessing a synergy won't be possible, but as you're more familiar with the chip, do keep the "SJOSHPO" spec in mind, and if you think it could work, let us know. I for one will be very interested to see what you do with the chip, regardless.
Among other things, I/O would likely be an issue, as we likely need to support, say, 8 LVDS pairs per channel - 72 pins right there.
The "back end" (or would that be the "front end"?), I mean the user interface, will be similar in both cases. The hardware portion will provide the basic data stream and likely some degree of signal processing, which will be distinct in our two approaches. But once you have the data stream, the user interface programming will be separate and will likely be similar in the two approaches. Regardless of whether the user interface is embedded or on a tablet, phone or PC, it will work the same way.
I did a search and didn't find what I would call a "SJOSHPO" spec. I found a block diagram and a "Further Details" post as well as an "octopus" post and the original "Let's Design a Scope" post, but nothing like a "proper" spec. Is the spec just the assembled comments to all the posts? It seems each post has its own comments and they are largely all over the map. Has anything like a consensus started to emerge?
Wale Bakare 6/24/2012 10:26:13 AM User Rank Sheriff
Re: Obsolescence & modularity
@MD it may look difficult, but synergy could be achieved if the project is well managed, with clarity on who's ready to do what. High performance would be a good thing if successfully achieved.
RFIDguru 6/11/2012 10:52:25 AM User Rank Apprentice
Re: Link Bandwidth
I may like the Thunderbolt idea in 2016, when there's TB 2.0. Then it will be fast enough to handle 4 channels @ 1 GSPS. Anno 2012, TB is Apple only, too slow, and too new for a scope we need to build with 2012 technology.
I'd prefer to use USB. The Cypress CYUSB3014 handles USB 3.0 and is backwards compatible with USB 2.0. I find USB 2.0 speed satisfactory for a scope (Picoscope user). You'll need an FPGA and some (internal?) memory anyway.
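As a back-of-envelope check on the link question, here is a small sketch comparing a raw sample stream against nominal link signalling rates. The 8 bits per sample is an assumption, the link figures are raw line rates (real throughput is lower once protocol overhead is counted), and the function names are made up for illustration.

```python
# Nominal raw signalling rates in bits per second (protocol overhead ignored).
LINKS_BITS_PER_S = {
    "USB 2.0": 480e6,
    "USB 3.0": 5e9,
    "Thunderbolt (one channel)": 10e9,
}

def stream_rate_bits_per_s(channels, sample_rate_sps, bits_per_sample=8):
    """Raw data rate of a continuous, uncompressed sample stream."""
    return channels * sample_rate_sps * bits_per_sample

def links_that_can_stream(required_bits_per_s):
    """Names of links whose nominal rate covers the required stream rate."""
    return [name for name, rate in LINKS_BITS_PER_S.items()
            if rate >= required_bits_per_s]

def transfer_time_s(num_samples, link_name, bits_per_sample=8):
    """Best-case time to upload a captured buffer over the named link."""
    return num_samples * bits_per_sample / LINKS_BITS_PER_S[link_name]

if __name__ == "__main__":
    need = stream_rate_bits_per_s(channels=4, sample_rate_sps=1e9)
    print("4 ch @ 1 GSPS needs", need / 1e9, "Gbit/s")
    print("Links that could stream it live:", links_that_can_stream(need))
    print("1 Mpt buffer over USB 2.0:", transfer_time_s(1_000_000, "USB 2.0"), "s")
```

Under these assumptions none of the listed links can carry four channels at 1 GSPS continuously, so captures would be buffered in local memory and uploaded afterwards, whichever link is chosen.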
jsalsburg 6/10/2012 7:15:25 PM User Rank Apprentice
Re: Ask yourself these questions...
How can I put this gracefully? Make the leap to the Future. You are assuming that there is a limitation to Read/Write Speed of Thunderbolt that will somehow cause it to limit the performance of the Device. Any System that writes to Memory and Displays the result will have the same limitations, whether the Memory is physically inside the device or somewhere else. What you want in a Future Design is a way to apply a modern Operating System to analyze and display what is in memory. Because Thunderbolt is so fast, it can write directly to Memory in the PC. Real time Acquisition directly to the PC is now a reality. If you create a device that has its own Memory and Display, not connected to a PC (oldschool), it must have its own Operating System with all the accompanying troubles of development. This will limit its future. It is now possible to alleviate the hardware restriction inherent in high speed instrumentation: "Throughput." All the questions and tribulations of misunderstanding the basic idea of serial interconnect are because those interested in this Project have not done their homework. Study the presentations on Intel's Web Site on Thunderbolt. This will cast out all your doubts about this concept; seeing is believing. Study the literature, watch the Videos, and make the leap to the new concept. Conceptual Blockbusting is always difficult; Human Beings operate under the Mammalian Imperative, clinging to the familiar. Remember when RS-232 and Parallel Printer Ports were the only way to interface a new Device to a PC? That was a little more than 10 years ago. USB is only an interim step in Connectivity. In the new age of 2k, 4k and 8k Video, where Petabyte projects are giving way to Exabyte Servers, the problem facing us is not how fast the data can be transferred, it is how to manage the huge files it creates. The issue is Project Management (of Terabyte Files), not Transfer rates.
File Management is not a Problem of the Hardware Design of the Device, it will, however, be a Development issue with the App that makes the new Device possible on the PC.
FYI, I'm typing this on a machine with a Thunderbolt connector!
The TB data protocol is PCIe. This is a complex, packetized protocol. It's not like you have a direct line to memory. In fact, TB's 2GB/s BW is nowhere near a modern direct memory channel of perhaps 8GB/s. And a direct channel is much more "random access" than a PCIe packet!
Unfortunately, our design has to deal with reality, not pie-in-the-sky dreams of some wondrous future IO.
No "OS" is needed on the module end. Just FPGA logic that *WILL* have direct and random local memory access, at pretty much whatever speed is required.
I'm sorry we can't design a scope that will last through the next 100 years of technology upgrades, just a box that returns a great price/performance ratio within existing technology.
The block diagram is very clear and I am sure we can get started. My motivation in this project will be to get enough ideas to make an embedded system that will convert my iPad into a scope. People, let us get this started. A popular scope that works with an iPad through a USB plug-in would be very cool.
While I don't want to get into detailed design or parts selection yet, I sometimes can't stop myself from doing a bit of searching.
So, for example, if we would rather a first cut of an AFE daughtercard did NOT use a pricey 1-3GS ADC, a very serviceable module could be had with one of these $40 Analog Devices chips:
The AD9286 is a 2-ch 250MS ADC, interleavable to 1-ch 500MS. BW is 500MHz, so our BW goal would be met for repetitive signals.
The AD9484 is a 1-ch only, 500MS ADC with a 1GHz BW.
Just wanted to give hope to those who may be discouraged by the ~$300+ price tag of a faster ADC! :-)
I assumed that an analog trigger circuit would still be required, despite having advanced triggers implemented in the FPGA.
Maybe not.
If we digitally process the pair of samples that straddle the trigger threshold, and compute the exact sub-sample trigger time using linear interpolation, that time stamp can be attached to the capture, allowing it to be aligned to other captures.
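Here is a minimal sketch of that interpolation, assuming a rising-edge trigger and evenly spaced samples. The function name and arguments are hypothetical; a real design would do this in FPGA logic rather than Python.

```python
def find_trigger_time(samples, threshold, sample_period_s=1.0):
    """Time of the first rising threshold crossing, or None if there is none.

    The crossing is placed between the two samples that straddle the
    threshold by linear interpolation, so the result has sub-sample
    resolution.
    """
    for i in range(len(samples) - 1):
        below, above = samples[i], samples[i + 1]
        if below < threshold <= above:
            # Fraction of one sample period between sample i and the crossing.
            # above > below is guaranteed here, so no division by zero.
            frac = (threshold - below) / (above - below)
            return (i + frac) * sample_period_s
    return None

if __name__ == "__main__":
    # Assumed example: 2 ns sample period (500 MS/s), threshold at 2.5 counts.
    t = find_trigger_time([0, 1, 2, 3, 4], threshold=2.5, sample_period_s=2e-9)
    print("trigger at", t, "s after the first sample")
```

The returned time is relative to the first sample of the record, which is what lets separate captures be shifted into alignment with each other. Linear interpolation is only as good as the straight-line assumption between the two samples; near the bandwidth limit a sinc or polynomial fit may be needed.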
Nobody said this wasn't going to get complicated :-o But, the necessary ideas will slowly gel.
Jonny Doin 6/11/2012 11:49:30 PM User Rank Apprentice
Re: Digital trigger
@Michael:
The trigger subsystem is probably more than half the logic for the scope. Actually, for a usable realtime scope, the trigger logic is probably the most important feature after the analog front end and ADC.
We often see this situation: a great scope that has everything but that specific trigger that you need.
The trigger circuit must analyze the raw ADC signal in search of trigger points, but that often requires a much faster logic than the sampling rate front end / memory back end. Assuming that we are operating at the bulk logic speed limit for the sampling and memory, the trigger can be very difficult to implement.
One alternative for that problem can be what they do at CERN, in the LHC collision detectors like ATLAS. To maximize the chance of capturing relevant events, several rings of ultrafast logic layers deal with speculative filtering of chunks of raw data, trying to choose the most likely chunks that are candidates to a trigger, by focussing on a coarse aspect of the data that is a simple pattern that may be worth a second look.
After the first ring, a second logic layer has much more time available to recategorize the few selected chunks and choose the ones that really constitute a trigger. Any number of pipelined layers may be implemented, to achieve a very high throughput trigger logic, using the same logic core speed as the front-end.
The result is a configurable, flexible and high-performance trigger core, within the same FPGA speed grade.
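As a loose software analogy of that layering (not a description of how ATLAS actually does it), here is a sketch of a two-level cascade: a cheap first pass discards most chunks, and a slower exact pass examines only the survivors. The chunking scheme, the min/max test, and all names are assumptions made for illustration; in an FPGA these would be pipelined hardware stages.

```python
def level1_candidates(samples, chunk_len, threshold):
    """Coarse pass: start indices of chunks whose min/max straddle the threshold."""
    starts = []
    for start in range(0, len(samples) - 1, chunk_len):
        # Overlap by one sample so an edge on a chunk boundary is not missed.
        chunk = samples[start:start + chunk_len + 1]
        if min(chunk) < threshold <= max(chunk):
            starts.append(start)
    return starts

def level2_confirm(samples, start, chunk_len, threshold):
    """Exact pass: index of the first rising crossing in the chunk, or None."""
    end = min(start + chunk_len, len(samples) - 1)
    for i in range(start, end):
        if samples[i] < threshold <= samples[i + 1]:
            return i
    return None

def cascaded_trigger(samples, chunk_len, threshold):
    """Run the cheap filter first, then the exact test on the survivors only."""
    hits = []
    for start in level1_candidates(samples, chunk_len, threshold):
        index = level2_confirm(samples, start, chunk_len, threshold)
        if index is not None:
            hits.append(index)
    return hits

if __name__ == "__main__":
    data = [0, 0, 0, 0, 9, 9, 9, 9, 9, 0, 0, 0, 0, 0, 9, 9]
    print("level 1 keeps chunks at", level1_candidates(data, 4, 5))
    print("level 2 confirms rising edges at", cascaded_trigger(data, 4, 5))
```

In this example the first level passes a chunk that contains only a falling edge, and the second level rejects it. That is the point of the cascade: the first stage may be sloppy as long as it never throws away a real trigger.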
Yes, I agree – the trigger is going to be very important to get right, and not easy either!
If I understand correctly re your CERN analogy, you're envisioning a cascade of more-or-less identical trigger logic blocks, separated by gate delays? That would be fun to implement in an FPGA ;-)
I guess this approach would still require a physical analog comparator to generate the trigger signal (or maybe use an LVDS receiver on the FPGA, though I'd question the quality and ease of that!). No big deal, but a pure digital approach has obvious appeal.
I like having all these ideas floating about our brains in "superposition". At some point, the state will collapse, and we'll have our scope. Quantum computers – who needs'em.