joe_c - what is it you are trying to do?
Unless you are familiar with Thumb-2 assembler, you will spend a significant amount of time trying to produce better code than the compiler, and may discover that it is very hard or even impossible. ARM was part of the wave of RISC processors (typified by MIPS) that were designed to be targeted by high-level language compilers.
I am a big fan of Jon L Bentley's Writing Efficient Programs:
http://www.amazon.com/Writing-Efficient-Programs-Prentice-Hall-Software/dp/013970244X
(I used to teach undergrad and post grad CS & Software Engineering)
Bentley suggests a six-layer model for thinking about efficiency: three layers are software and three are hardware.
Each layer typically yields a 10x efficiency gain.
The effectiveness of translation of program code to binary is only one of the three software layers, and the compiler is likely to be pretty good (the C compiler has been developed and improved for many, many years). So it is often easier to go look at a different layer to get a significant efficiency improvement.
Directly accessing the hardware is straightforward if you are accustomed to C.
There are a bunch of helpful pre-defined macros in the libmaple gpio.h header which correspond to the General Purpose I/O (GPIO) ports.
So you could write:
*(volatile unsigned int *)(0x40010800 + 0x10) = 0x00000020; /* set pin A5 via GPIO port A, BSRR register */
or
GPIOA_BASE->BSRR = 0x00000020; /* set pin A5, which is the LED pin, via GPIO port A, BSRR register */
Both C statements compile to the same instructions. It is a short sequence because the compiler is dealing with constants which it can calculate for you at compile time.
Once the address of GPIOA_BASE->BSRR and the constant are loaded into registers, the assignment itself is a single store instruction.
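To see why the two forms end up at the same address, here is a rough sketch of a register struct laid out in the order RM0008 lists the GPIO registers. The names my_gpio_regs, MY_GPIOA_ADDR and my_bsrr_address are made up for illustration and are not libmaple's own definitions; check gpio.h for the real ones.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical mirror of the STM32F103 GPIO register block, in the order
 * given in RM0008. Not copied from libmaple. */
typedef struct {
    volatile uint32_t CRL;   /* offset 0x00: configuration, pins 0-7  */
    volatile uint32_t CRH;   /* offset 0x04: configuration, pins 8-15 */
    volatile uint32_t IDR;   /* offset 0x08: input data               */
    volatile uint32_t ODR;   /* offset 0x0C: output data              */
    volatile uint32_t BSRR;  /* offset 0x10: bit set/reset            */
    volatile uint32_t BRR;   /* offset 0x14: bit reset                */
    volatile uint32_t LCKR;  /* offset 0x18: configuration lock       */
} my_gpio_regs;

/* Base address of GPIO port A, as used in the raw-pointer form above. */
#define MY_GPIOA_ADDR 0x40010800u

/* Compile-time check that the struct member lands where the manual says. */
static_assert(offsetof(my_gpio_regs, BSRR) == 0x10,
              "BSRR must sit at offset 0x10");

/* The address that port->BSRR folds into: base plus member offset. */
uint32_t my_bsrr_address(uint32_t port_base)
{
    return port_base + (uint32_t)offsetof(my_gpio_regs, BSRR);
}
```

Because the base and the offset are both constants, the compiler can work out the final address at compile time, which is why the struct form costs nothing extra over the raw pointer.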
So toggling pins with:
GPIOA_BASE->BSRR = 0x00000020; /* set pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00200000; /* clear pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00000020; /* set pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00200000; /* clear pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00000020; /* set pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00200000; /* clear pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00000020; /* set pin A5, which is the LED pin, via GPIO port A, BSRR register */
GPIOA_BASE->BSRR = 0x00200000; /* clear pin A5, which is the LED pin, via GPIO port A, BSRR register */
should generate a sequence which is as quick as the hardware can go. (This technique lets you toggle some or all of the 16 pins in a port.)
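If you would rather not hand-compute the BSRR constants, something like the following might help. In BSRR, bit n of the low half sets pin n and bit n+16 clears it, so pin 5 is 0x00000020 to set and 0x00200000 to clear. The helper names pin_mask, bsrr_set and bsrr_clear are my own invention, not part of libmaple.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: one-hot mask for a single pin, pin in 0..15. */
uint16_t pin_mask(unsigned pin)
{
    assert(pin < 16);
    return (uint16_t)(1u << pin);
}

/* BSRR value that sets every pin whose bit is 1 in `pins` (bit n = pin n). */
uint32_t bsrr_set(uint16_t pins)
{
    return (uint32_t)pins;
}

/* BSRR value that clears every pin whose bit is 1 in `pins`. */
uint32_t bsrr_clear(uint16_t pins)
{
    return (uint32_t)pins << 16;
}
```

Called with constant arguments and optimisation turned on, these should fold down to plain constants, so you could write GPIOA_BASE->BSRR = bsrr_set(pin_mask(5) | pin_mask(6)); to raise two pins in one store. I have not measured that, so check the generated code if the timing matters.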
I'd recommend reading the libmaple source code, which is Open Source and freely available, and using some of its techniques to make your life easier.
You can get the source at https://github.com/leaflabs/libmaple
These threads give some concrete advice on how to do I/O quickly:
http://forums.leaflabs.com/topic.php?id=517
http://forums.leaflabs.com/topic.php?id=718
http://forums.leaflabs.com/topic.php?id=737
http://forums.leaflabs.com/topic.php?id=774
I have got 18MHz on my oscilloscope, using that somewhat ugly repetitive code, and 12MHz with more normal looking stuff.
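For what it's worth, "more normal looking" could be a plain loop along these lines. This is only a sketch and not the exact code I timed. pulse_pins is a made-up name, and it takes the register as a pointer so it can be tried on a PC against an ordinary variable. On the Maple you would pass &GPIOA_BASE->BSRR.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: pulse the pins in `mask` high then low, `count` times,
 * by writing to a BSRR-style register (low half sets, high half clears). */
void pulse_pins(volatile uint32_t *bsrr, uint16_t mask, uint32_t count)
{
    assert(bsrr != NULL);               /* checked once, outside the loop */
    while (count--) {
        *bsrr = (uint32_t)mask;         /* set the selected pins   */
        *bsrr = (uint32_t)mask << 16;   /* clear the selected pins */
    }
}
```

The loop counter and branch cost a few cycles per pulse, which is presumably why the unrolled version comes out faster.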
Get the latest copy of the STM32F103 manual "RM0008 Reference manual", currently:
http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/REFERENCE_MANUAL/CD00171190.pdf
It describes every peripheral in loving detail, and also gives an overall memory map.
If you want to change or toggle pins quickly, have a look at Section 9 "General-purpose and alternate-function I/Os (GPIOs and AFIOs)". Each pin of each GPIO port has its own memory address, so you can read or write it in a single 2-cycle instruction. You'll need to look up 'bit-banding', and may need an ARM technical reference manual.
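As a starting point for bit-banding, this is roughly how the alias address is worked out on a Cortex-M3: the peripheral region starts at 0x40000000, its alias region starts at 0x42000000, and each bit gets one 32-bit word in the alias region. The function name bitband_alias and the macro names are mine. Please check the numbers against the ARM documentation before relying on them.

```c
#include <assert.h>
#include <stdint.h>

#define PERIPH_BASE_ADDR     0x40000000u  /* start of the peripheral bit-band region */
#define PERIPH_BB_ALIAS_ADDR 0x42000000u  /* start of the matching alias region      */

/* Hypothetical helper: alias word address for bit `bit` (0..31) of the
 * 32-bit peripheral register at `reg_addr`. */
uint32_t bitband_alias(uint32_t reg_addr, unsigned bit)
{
    /* Only the first 1MB of the peripheral space is bit-banded. */
    assert(reg_addr >= PERIPH_BASE_ADDR &&
           reg_addr < PERIPH_BASE_ADDR + 0x100000u);
    assert(bit < 32);
    /* Each byte of the region maps to 32 alias bytes, each bit to 4. */
    return PERIPH_BB_ALIAS_ADDR
         + (reg_addr - PERIPH_BASE_ADDR) * 32u
         + bit * 4u;
}
```

For example, the GPIOA output data register (ODR) is at 0x4001080C, so bit 5 of it (pin A5) comes out at alias address 0x42210194. Writing 1 or 0 to that word changes just that pin. I used ODR here because it can be both read and written.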
Also get a copy of the STM32F103x8/STM32F103xB Datasheet if you have a standard Maple:
http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/DATASHEET/CD00161566.pdf
or the STM32F103xC/xD/xE Datasheet if you have a RET6:
http://www.st.com/internet/com/TECHNICAL_RESOURCES/TECHNICAL_LITERATURE/DATASHEET/CD00191185.pdf
These give the memory map of the STM32F103 that is on your board.
Please post questions if you'd like some more discussion.
(full disclosure: I am not a member of LeafLabs staff)