The SysTick timer is described in "PM0056 Programming manual: STM32F10xxx/20xxx/21xxx/L1xxxx Cortex-M3 programming manual":
http://www.st.com/st-web-ui/static/active/en/resource/technical/document/programming_manual/CD00228163.pdf
Section "4.5 SysTick timer (STK)" says:
"The processor has a 24-bit system timer, SysTick, that counts down from the reload value to zero, reloads (wraps to) the value in the LOAD register on the next clock edge, then counts down on subsequent clocks."
RM0008, section "7.2 Clocks" adds:
"The RCC feeds the Cortex System Timer (SysTick) external clock with the AHB clock (HCLK) divided by 8. The SysTick can work either with this clock or with the Cortex clock (HCLK), configurable in the SysTick Control and Status Register."
EDIT:
I read systick.h. SysTick is driven by HCLK (not HCLK/8), so with a 72MHz core clock SysTick runs at 72MHz. If you read the SysTick counter directly, that's what you'll see: a timer which runs fast enough to time individual instructions. A reload value of 71999 (SYSTICK_RELOAD_VAL is N-1 for N ticks) then gives a 1ms 'tick'.
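To make the N-1 arithmetic concrete, here is a minimal sketch of how a reload value follows from the clock and the desired tick rate. The helper name systick_reload_for is made up for illustration and is not part of libmaple; it only assumes the 24-bit counter width quoted from PM0056 above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a libmaple function): reload value for a SysTick
 * clocked at clk_hz that should wrap tick_hz times per second. */
uint32_t systick_reload_for(uint32_t clk_hz, uint32_t tick_hz)
{
    assert(tick_hz > 0 && clk_hz >= tick_hz);
    uint32_t ticks = clk_hz / tick_hz;  /* timer clocks per tick */
    assert(ticks - 1 <= 0x00FFFFFFu);   /* the counter is only 24 bits wide */
    return ticks - 1;                   /* counting N-1 down to 0 takes N clocks */
}
```

With a 72MHz clock and a 1kHz tick this gives 72000000 / 1000 - 1 = 71999, matching the reload value above. With the HCLK/8 source (9MHz) the same tick rate would need 8999.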
The speed of the general purpose timers is constrained by the clock source, which can be the bus (see the clock tree in RM0008 Figure 8) or an external source. According to the STM32F103B datasheet (5.3.15 TIM timer characteristics), the external clock source is limited to 36MHz and the bus clock to 72MHz.
However, the actual frequency fed to the advanced timer (TIM1), and the group of general purpose timers (2, 3, 4), can be different. Those clocks are set by a bunch of bits in control registers. Look at RM0008, section "7.2 Clocks" to see the possibilities.
mlundinse explains here http://forums.leaflabs.com/topic.php?id=15451#post-30954 that they are all set to the SYSCLK, which is 72MHz.
I'd have to check the details for NOP. However, assuming it is a 16-bit instruction, pre-fetch should keep up for 'straight-line' code, so I'd expect a series of X NOPs to run in X clocks.
Even an inlined call such as systick_get_count() will, I expect, take 2 cycles to read from memory (the SysTick peripheral).
Try:
int a=systick_get_count();
int b=systick_get_count();
int c=a-b; /* SysTick counts down, so a-b is the number of clocks elapsed */
and see what the value of c is.
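One thing to watch for: because SysTick counts down and reloads every millisecond, the two readings can straddle a reload, in which case the second reading is larger than the first. A minimal sketch of a difference that copes with that, assuming at most one reload happened between the readings (systick_elapsed is a made-up name, not a libmaple function):

```c
#include <stdint.h>

/* Hypothetical helper: clocks elapsed between two readings of a down-counter
 * that reloads to `reload` after reaching 0. Assumes at most one reload
 * occurred between the two readings. */
uint32_t systick_elapsed(uint32_t first, uint32_t second, uint32_t reload)
{
    if (first >= second) {
        return first - second;               /* no reload in between */
    }
    return first + (reload + 1u) - second;   /* counter wrapped once */
}
```

For the test above you would call it as systick_elapsed(a, b, 71999), using the reload value from systick.h.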